Abstract

For sample covariance matrices with i.i.d. entries with sub-Gaussian
tails, when both the number of samples and the number of variables
become large and their ratio approaches one, it is a well-known result
of A. Soshnikov that the limiting distribution of the largest eigenvalue
is the same as that of Gaussian samples. In this paper, we extend this
result to two cases. The first case is when the ratio approaches
an arbitrary finite value. The second case is when the ratio tends to
infinity or becomes arbitrarily small.

1 Introduction 

The scope of this paper is to study the limiting behavior of the largest
eigenvalues of real and complex sample covariance matrices with
independent identically distributed (i.i.d.), but not necessarily
Gaussian, entries. Consider a sample of size $p$ of i.i.d. $N \times 1$
random vectors $y_1, \ldots, y_p$. We further assume that the sample
vectors $y_k$ have mean zero and covariance $\Sigma = Id$. We use
$X = [y_1, \ldots, y_p]$ to denote the $N \times p$ data matrix and
$M_N = \frac{1}{N}XX^*$ to denote the sample covariance matrix. Random
sample covariance matrices were first studied in mathematical statistics
([1], [13], [11]). A huge literature deals with the case where
$p \to \infty$, $N$ being fixed, which is now quite well understood.
Contrary to the traditional assumptions, it is now of current interest
to study the case where $N$ is of the same order as $p$, due to the large
amount of data available. In particular, the limiting behavior of the
largest eigenvalues is important for testing hypotheses on the covariance
matrix $\Sigma$. Here we focus on the simple case,
$H_0 : \Sigma = Id$ versus $H_a : \Sigma \neq Id$, and study the asymptotic
distribution of extreme eigenvalues under $H_0$. The study of extreme
eigenvalues is also of interest in principal component analysis. We refer
the reader to [15] and for a review of statistical applications. Other
examples of applications include genetics [21], mathematical finance [21],
[18], [19], wireless communication [34], physics of mixture [25], and
statistical learning [12]. We point out that the spectral properties of
$M_N$ readily translate to the companion matrix $W_N = \frac{1}{N}X^*X$.
Indeed, $W_N$ is a $p \times p$ matrix, of rank $N$, with the same non-zero
eigenvalues as $M_N$. Thus, it is enough to study the spectral properties
of $M_N$ to give a complete picture of the spectrum of such sample
covariance matrices.
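
The relation between $M_N$ and its companion matrix can be checked numerically. The following Python sketch is only an illustration under assumed choices (real Gaussian entries, small sizes $N = 4$, $p = 7$, and a hypothetical helper name `nonzero_spectra`); it compares the spectrum of $XX^*/N$ with that of $X^*X/N$.

```python
import numpy as np

def nonzero_spectra(N=4, p=7, seed=0):
    """Compare the spectra of M_N = X X^T / N and W_N = X^T X / N (illustrative sizes)."""
    rng = np.random.default_rng(seed)
    X = rng.standard_normal((N, p))      # N x p data matrix with i.i.d. real entries
    M = X @ X.T / N                      # N x N sample covariance matrix
    W = X.T @ X / N                      # p x p companion matrix, of rank N
    eig_M = np.sort(np.linalg.eigvalsh(M))[::-1]   # decreasing order
    eig_W = np.sort(np.linalg.eigvalsh(W))[::-1]
    # The N largest eigenvalues of W_N should match those of M_N;
    # the remaining p - N eigenvalues of W_N should vanish.
    return eig_M, eig_W[:N], eig_W[N:]

eig_M, top_W, rest_W = nonzero_spectra()
print(np.allclose(eig_M, top_W), np.allclose(rest_W, 0.0, atol=1e-10))
```

Up to floating-point error, the $N$ largest eigenvalues of the two matrices coincide and the remaining $p - N$ eigenvalues of the companion matrix are zero.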

1.1 Model and results 

We consider both real and complex random sample covariance matrices

\[ M_N = \frac{1}{N}XX^*, \]

where $X$ is an $N \times p$, $p = p(N) \ge N$, random matrix satisfying
certain "moment conditions". In the whole paper, we set $\gamma_N = p/N$.
We assume that the entries $X_{ij}$, $1 \le i \le N$, $1 \le j \le p$, of the
sequence of random matrices $X = X_N$ are not necessarily Gaussian random
variables satisfying the following conditions. First, in the complex case,

(i) $\{\Re X_{ij}, \Im X_{ij} : 1 \le i \le N, 1 \le j \le p\}$ are real
independent random variables,

(ii) all these real variables have symmetric laws (thus,
$E[X_{ij}^{2k+1}] = 0$ for all $k \in \mathbb{N}$),

(iii) $\forall i, j$, $E[(\Re X_{ij})^2] = E[(\Im X_{ij})^2] = \sigma^2/2$,

(iv) all their other moments are assumed to be sub-Gaussian, i.e. there
exists a constant $\tau > 0$ such that uniformly in $i, j$ and $k$,
$E[|X_{ij}|^{2k}] \le (\tau k)^k$.

In the real setting, $X = (X_{ij})_{1 \le i \le N,\, 1 \le j \le p}$ is a
random matrix such that

(i') the $\{X_{ij}, 1 \le i \le N, 1 \le j \le p\}$ are independent random
variables,

(ii') the laws of the $X_{ij}$ are symmetric (in particular,
$E[X_{ij}^{2k+1}] = 0$),

(iii') for all $i, j$, $E[X_{ij}^2] = \sigma^2$,

(iv') all the other moments of the $X_{ij}$ grow not faster than the
Gaussian ones. This means that there is a constant $\tau > 0$ such that,
uniformly in $i, j$ and $k$, $E[X_{ij}^{2k}] \le (\tau k)^k$.

When the entries of $X$ are further assumed to be Gaussian, we will denote
by $X_G$ the corresponding model. In this case and in the complex setting,
the sample covariance matrix $\frac{1}{N}X_G X_G^*$ is of the so-called
Laguerre Unitary Ensemble (LUE), which is also called the complex Wishart
ensemble. In the real setting, it is of the so-called Laguerre Orthogonal
Ensemble (LOE) or real Wishart ensemble.



The scope of this paper is to describe the large-$N$-limiting distribution
of the $K$ largest eigenvalues induced by any such ensemble, for any fixed
integer $K$ independent of $N$. Two regimes are investigated in this paper.
In the first part, we assume that there exists some constant $\gamma \ge 1$
such that $\lim_{N\to\infty}\gamma_N = \gamma$. In the second part, we
consider the case where $\gamma_N \to \infty$ as $N \to \infty$.

Before stating our results, we recall some known results about sample
covariance matrices. We first focus on the case where
$\lim_{N\to\infty}\gamma_N = \gamma < \infty$. Let
$\lambda_1 \ge \lambda_2 \ge \cdots \ge \lambda_N$ be the ordered
eigenvalues induced by any ensemble of the above type. The first
fundamental result for the limiting spectral behavior of such random
matrix ensembles was obtained by Marchenko and Pastur in [20] (in a much
more general context than here). It is in particular proved therein that
the spectral measure $\mu_N = \frac{1}{N}\sum_{i=1}^{N}\delta_{\lambda_i}$
converges a.s. as $N$ goes to infinity. Set
$u_\pm = \sigma^2(1 \pm \sqrt{\gamma})^2$. Then one has that

\[ \lim_{N\to\infty}\mu_N = \rho_{MP} \ \text{a.s., where}\quad
\frac{d\rho_{MP}(x)}{dx} =
\frac{\sqrt{(u_+ - x)(x - u_-)}}{2\pi x \sigma^2}\, 1_{[u_-, u_+]}(x).
\tag{1} \]

The limiting probability distribution $\rho_{MP}$ is the so-called
Marchenko-Pastur distribution.
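
As a sanity check on the normalization of the density in (1), the Python sketch below integrates it numerically. The helper names (`mp_edges`, `mp_density`, `trapezoid`) and the values $\gamma = 2$, $\sigma^2 = 1$ are illustrative assumptions, and the density is used in the form $\sqrt{(u_+ - x)(x - u_-)}/(2\pi x\sigma^2)$ for $\gamma > 1$.

```python
import numpy as np

def mp_edges(gamma, sigma2=1.0):
    """Edges u_- and u_+ of the Marchenko-Pastur support."""
    root = np.sqrt(gamma)
    return sigma2 * (1 - root) ** 2, sigma2 * (1 + root) ** 2

def mp_density(x, gamma, sigma2=1.0):
    """Marchenko-Pastur density for M_N = XX^*/N with gamma = p/N > 1."""
    u_minus, u_plus = mp_edges(gamma, sigma2)
    x = np.asarray(x, dtype=float)
    inside = (x > u_minus) & (x < u_plus)
    dens = np.zeros_like(x)
    xi = x[inside]
    dens[inside] = np.sqrt((u_plus - xi) * (xi - u_minus)) / (2 * np.pi * xi * sigma2)
    return dens

def trapezoid(y, x):
    """Plain trapezoidal rule, written out to avoid version-specific NumPy helpers."""
    return float(np.sum((y[1:] + y[:-1]) * np.diff(x)) / 2)

gamma, sigma2 = 2.0, 1.0
u_minus, u_plus = mp_edges(gamma, sigma2)
grid = np.linspace(u_minus, u_plus, 200001)
total_mass = trapezoid(mp_density(grid, gamma, sigma2), grid)
first_moment = trapezoid(grid * mp_density(grid, gamma, sigma2), grid)
print(total_mass, first_moment)
```

Under these assumptions the total mass should be close to 1 and the first moment close to $\sigma^2\gamma$, which is the limit of $\frac{1}{N}\mathrm{Tr}\, M_N$.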

The above result gives no insight about the behavior of the largest
eigenvalues. The first study of the asymptotic behavior of the largest
eigenvalue goes back to S. Geman [10]. It was later refined in [2] and
[26]. In particular, it is well known that
$\lim_{N\to\infty}\lambda_1 = u_+$ a.s. if the entries of the random matrix
$X$ admit moments up to order 4. Significant results about fluctuations of
the largest eigenvalues around $u_+$ are much more recent and are
essentially established for Wishart ensembles only. In particular, the
limiting distribution of the largest eigenvalue has been obtained by
K. Johansson [13] for complex Wishart matrices and I. Johnstone [15] for
real Wishart matrices. A. Soshnikov [30] has derived for both ensembles the
limiting distribution of the $K$ largest eigenvalues, for any fixed integer
$K$. Before recalling their results, we need a few definitions. We denote
by $\lambda_1^\beta \ge \lambda_2^\beta \ge \cdots \ge \lambda_N^\beta$ the
eigenvalues induced by the Wishart ensembles, with $\beta = 2$ (resp.
$\beta = 1$) for the LUE (resp. the LOE). We also define the limiting
Tracy-Widom distribution for the largest eigenvalue. Let $Ai$ denote the
standard Airy function and $q$ denote the solution of the Painlevé II
differential equation $\frac{\partial^2}{\partial x^2}q = xq(x) + 2q^3(x)$,
with boundary condition $q(x) \sim Ai(x)$ as $x \to +\infty$.

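Definition 1.1. The GUE (resp. GOE) Tracy-Widom distribution for the
largest eigenvalue is defined by the cumulative distribution function

\[ F_2(x) = \exp\Big\{\int_x^\infty (x - t)\,q^2(t)\,dt\Big\}
\quad\Big(\text{resp. } F_1(x) = \exp\Big\{\int_x^\infty
\Big(-\frac{q(t)}{2} + \frac{x - t}{2}\,q^2(t)\Big)dt\Big\}\Big). \]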
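
The two distribution functions are linked by a simple identity, obtained by splitting the exponent of $F_1$ into its two terms. The short derivation below is a sketch that assumes the forms $F_2(x) = \exp\{\int_x^\infty (x-t)q^2(t)\,dt\}$ and $F_1(x) = \exp\{\int_x^\infty(-\frac{q(t)}{2} + \frac{x-t}{2}q^2(t))\,dt\}$.

```latex
\begin{align*}
F_1(x)
  &= \exp\Big\{-\frac{1}{2}\int_x^\infty q(t)\,dt\Big\}
     \exp\Big\{\frac{1}{2}\int_x^\infty (x - t)\,q^2(t)\,dt\Big\} \\
  &= \exp\Big\{-\frac{1}{2}\int_x^\infty q(t)\,dt\Big\}\, F_2(x)^{1/2},
\end{align*}
so that $F_1(x)^2 = F_2(x)\exp\big\{-\int_x^\infty q(t)\,dt\big\}$.
```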



The GUE (resp. GOE) Tracy-Widom distribution for the joint distribution
of the $K$ largest eigenvalues (for any fixed integer $K$) has been
similarly defined. We refer the reader to [35] and [36] for a precise
definition. We then rescale the eigenvalues as follows: for
$i = 1, \ldots, N$, we set



Theorem 1.1. ([13], [15], [30]) The joint distribution of the $K$ largest
eigenvalues of the LUE (resp. LOE) rescaled as in (2) converges, as
$N \to \infty$, to the joint distribution defined by the GUE (resp. GOE)
Tracy-Widom law.

The proof of Theorem 1.1 relies on the crucial fact that the joint
eigenvalue density of the Wishart ensembles can be exactly computed.
Starting from numerical simulations, it was then conjectured, in [15] e.g.,
that Theorem 1.1 actually holds for a class of random sample covariance
matrices much wider than the Wishart ensembles. Such a universality result
was later proved for some quite general ensembles by A. Soshnikov [30], yet
under some restriction on the sample size, as we now recall.
For any ensemble satisfying (i) to (iv) (resp. (i') to (iv')), we set:



Theorem 1.2. ([30]) Assume that $p - N = O(N^{1/3})$. The joint
distribution of the $K$ rescaled largest eigenvalues $\mu_i$, $i \le K$,
induced by any ensemble satisfying (i) to (iv) (resp. (i') to (iv'))
converges, as $N$ goes to infinity, to the joint distribution defined by
the GUE (resp. GOE) Tracy-Widom law.

In this paper, we prove that such a universality result holds for any value
of the parameter $\gamma$. This is the main result of this note.





(3) 
Theorem 1.3. The joint distribution of the $K$ rescaled largest eigenvalues
$\mu_i$, $i \le K$, induced by any ensemble satisfying (i) to (iv) (resp.
(i') to (iv')) converges, as $N$ goes to infinity, to the joint
distribution defined by the GUE (resp. GOE) Tracy-Widom law. The result
holds for any value of the parameter $\gamma \ge 1$.

Remark 1.1. Assumptions (iv) and (iv') can actually be relaxed. This
relaxation is discussed in the second paragraph of Subsection 1.2.

Before giving secondary results, we give a few comments on the way we
proceed to prove Theorem 1.3. In Theorem 1.2, the restriction on $p - N$
follows from the idea of the proof used therein. Basically, when
$\gamma = 1$, the eigenvalues of a random sample covariance matrix roughly
behave as the squares of those of a typical Wigner random matrix. This
correspondence still works for the largest eigenvalues, but fails if
$\gamma$ is not close enough to one. Theorem 1.2 has been proved using
universality results established for classical Wigner random matrices.
Here, we revisit the problem of computing the asymptotics of
$E[\mathrm{Tr}\, M_N^L]$ for some powers $L$ that may go to infinity, using
combinatorial tools specifically well suited for the study of spectral
functions of sample covariance matrices. It is well known that Dyck paths
and Catalan numbers are associated to standard Wigner matrices (see [1]).
Suitable combinatorial tools in the case of sample covariance matrices are
the so-called Narayana numbers and some particular Dyck paths. Using
those, we can extend the universality result of [30] to any value of the
ratio $\gamma$.

The case where $\gamma < 1$ can also be considered thanks to the companion
matrix $W_N$. Let $\lambda'_i$, $1 \le i \le p$, be the eigenvalues of
$\frac{1}{p}X^*X$, ordered in decreasing order and let
$\delta_N = \gamma_N^{-1} = N/p$, so that $\delta_N \to 1/\gamma < 1$ as
$N \to \infty$. We set:

Corollary 1.1. Under the assumptions (i) to (iv) (resp. (i') to (iv')),
the joint distribution of $(\mu'_1, \ldots, \mu'_K)$ converges as
$N \to \infty$ to the GUE (resp. GOE) Tracy-Widom joint distribution of the
$K$ largest eigenvalues.

The machinery we develop to prove Theorem 1.3 can also be used to
consider the case where the size of the sample data increases in such a
way that $p, N$ go to infinity and $\gamma_N \to \infty$. The
large-$N$-limiting behavior of extreme eigenvalues of Wishart matrices for
such a regime has been obtained by N. El Karoui [6]. The particular
interest of such a study for statistical applications (e.g. microarrays)
is also explained in great detail therein.

Theorem 1.4. ([6]) With the same rescaling, Theorem 1.1 actually holds in
the case where $\lim_{N\to\infty}\gamma_N = \infty$.

Under the same assumptions (i) to (iv) (resp. (i') to (iv')), we prove
that universality still holds in the regime $p/N \to \infty$.

Theorem 1.5. Theorem 1.3 also holds if $\lim_{N\to\infty}\gamma_N = \infty$.

1.2 Statistical implications of the result

Testing homogeneity of a population has long been of interest in
mathematical statistics, and it is often a preliminary step in discriminant
analysis and cluster analysis. Assuming high dimensionality, we consider
the test of the null hypothesis $H_0 : \Sigma = Id$ vs. the alternative
hypothesis $H_a : \Sigma \neq Id$. The result stated here for the largest
eigenvalue can be formulated, in the complex (resp. real) case, as

\[ \lim_{N\to\infty} P(\mu_1 \le x \mid H_0) = F_{2\,(1)}(x). \tag{4} \]

The above theoretical result was essentially established for Gaussian
samples only so far (cf. [16] for a review). Removing the Gaussianity
assumption is actually fundamental for various statistical problems. Our
result may be of use for instance in genetics (see [21] e.g.). Samples in
genetic data are usually drawn from a distribution with compact support
and the size of matrices encountered therein is typically large enough so
that (4) should be observed for some appropriate models. Some Gaussian (or
other kinds of) mixtures also fall into the class of distributions studied
here. Such distributions occur for instance in finance in modeling some
fat-tailed returns.

The assumptions we make on the distribution of the entries may appear
strong for other statistical purposes. The moment assumptions (iv) and
(iv') can actually be relaxed, using truncation techniques. We can show
that Theorem 1.3 holds under the assumption that
$P(|X_{ij}| > x) \le C(1 + x)^{-m_0}$, $\forall i, j$, for some $m_0 > 36$
(see Remarks 2.3 and 2.5). We do not consider this case here, since it
would increase the technicalities of the paper. To illustrate this, a
simulation is given below where the entries of $X$ have a Student's
t-distribution with 40 degrees of freedom. These distributions may be
interesting in statistical models due to their (relative) robustness with
respect to outliers.

The symmetry assumption is probably more problematic and is again a
technical assumption for the proof. Indeed, it is expected that the lack of
symmetry has no impact on the limiting distribution of the largest
eigenvalues (provided the distribution is centered). Yet, analytical tools
to prove such a result are not established (see e.g. [22] for recent
progress).



The method we develop is also a first step towards considering samples
with non-identity covariance. Such results are of practical importance for
understanding the behavior of Principal Component Analysis and dimension
reduction in high dimensional settings. It is therefore important to
consider covariance matrices with more complex structure. In particular
(work in progress), the moment approach developed here seems to be well
suited to the case where the population covariance is a so-called "spiked"
diagonal matrix, that is, $\Sigma = Id + D$, where the deformation $D$ is a
finite rank diagonal matrix. This is important, since the test based on
$\mu_1$ may not reject $H_0$ if the largest eigenvalue of $D$ is not large
enough, because of a phase transition phenomenon described e.g. in [3].



A few simulations have been done to give, from a practical point of view,
an idea of the rate of convergence of the distribution of the largest
eigenvalue. We have generated real random matrices with i.i.d. entries with
a t-distribution or a Gaussian mixture distribution. To fit the limiting
Tracy-Widom distribution, we have rescaled the largest eigenvalue as
follows:

\[ \lambda^{N,p} =
\frac{N\lambda_1 - \sigma^2\big(\sqrt{N + a_1} + \sqrt{p + a_2}\big)^2}
{\sigma^2\big(\sqrt{N + a_1} + \sqrt{p + a_2}\big)
\Big(\frac{1}{\sqrt{N + a_1}} + \frac{1}{\sqrt{p + a_2}}\Big)^{1/3}}
\tag{5} \]

for some adjustment parameters $a_1$ and $a_2$. We indeed have some freedom
in the choice of these parameters, which can be any fixed real numbers. In
the real Wishart case, the best parameters are known to be
$a_1 = a_2 = -0.5$. Determining the optimal parameters is important to
improve convergence rates for $\lambda^{N,p}$. They have been established
in [8] for complex Gaussian samples. Providing a general formula for these
parameters is an issue that we cannot handle so far. Nevertheless, in view
of our simulations, the optimal parameters may depend on the distribution
of the entries and maybe also on the dimensions $N$ and $p$. For instance,
for the Gaussian mixture distribution, we found empirically that the best
parameters are $a_1 = a_2 = 0$ (choosing $a_1 = a_2 = -0.5$ gives results
which are similar but not so satisfying). We also tested various
t-distributions. For the t-distribution with 40 (resp. 20) degrees of
freedom, we found that $a_1 = a_2 = -0.5$ (resp. $a_1 = a_2 = 0$) were the
optimal parameters. Such a change can be understood, as the similarity
between the t-distribution and the Gaussian distribution increases
with the number of degrees of freedom.

Table 1: Gaussian mixture, $\gamma_N = 2, 4$

    Pc.  F_1    10 x 20   25 x 50   50 x 100  10 x 40   25 x 100  50 x 200
 -3.896  .01    .0079     .0103     .0097     .0107     .0114     .0092
 -3.516  .025   .0256     .0246     .0270     .0257     .0267     .0236
 -3.180  .05    .0554     .0487     .0512     .0568     .0516     .0489
 -2.782  .10    .1172     .0973     .0971     .1099     .1004     .0994
 -2.088  .25    .2912     .2502     .2478     .2724     .2453     .2436
 -1.269  .50    .5354     .4935     .4894     .5154     .4960     .4922
 -0.392  .75    .7550     .7401     .7336     .7465     .7440     .7389
  0.450  .90    .8951     .8897     .8879     .8870     .8892     .8894
  0.979  .95    .9417     .9396     .9436     .9368     .9422     .9425
  1.454  .975   .9676     .9662     .9718     .9644     .9718     .9691
  2.024  .99    .9855     .9857     .9875     .9824     .9875     .9858

The first column shows the percentiles of the $F_1$ Tracy-Widom
distribution corresponding to the values in the second column. The next 6
columns give the estimated cumulative probabilities for $\lambda^{N,p}$
obtained from 10000 replications. The entries of the random matrices are
i.i.d. with the Gaussian mixture distribution
$\frac{1}{2}\mathcal{N}(0,1) + \frac{1}{2}\mathcal{N}(0,3)$ and
$a_1 = a_2 = 0$. The Matlab functions normrnd and unifrnd were used to
generate the Gaussian mixtures. The Tracy-Widom quantiles were computed
thanks to the p2Num package, provided by C. Tracy.

For the simulations given below, we have considered two distributions:

- the Gaussian mixture
  $\frac{1}{2}\mathcal{N}(0,1) + \frac{1}{2}\mathcal{N}(0,3)$,

- the Student's t-distribution with 40 degrees of freedom.

We also considered various dimensions $N$, $N = 10$, $25$, and $50$, as
well as different sample size to dimension ratios,
$\gamma_N = 2, 4, 50$, and $100$. For each size and each distribution, we
have generated $R = 10000$ random matrices $X$ with i.i.d. entries. For
each replication, we have rescaled the largest eigenvalue of $XX^*/N$ as in
(5) with $a_1 = a_2 = -0.5$ (resp. $a_1 = a_2 = 0$) for the Student's t-
(resp. Gaussian mixture) distribution. We then derived the estimated
cumulative probabilities for $\lambda^{N,p}$ obtained from the 10000
replications.
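
The simulation procedure just described can be sketched in Python (the original computations used Matlab). Everything named below is an illustrative assumption: the helper names, the reading of $\mathcal{N}(0,3)$ as a Gaussian with variance 3 (so that $\sigma^2 = 2$), a Johnstone-type centering and scaling for the rescaling, and a reduced number of replications so that the sketch runs quickly.

```python
import numpy as np

# F_1 Tracy-Widom percentiles copied from the first column of the tables
# (levels .01, .025, .05, .10, .25, .50, .75, .90, .95, .975, .99).
TW1_PERCENTILES = np.array([-3.896, -3.516, -3.180, -2.782, -2.088, -1.269,
                            -0.392, 0.450, 0.979, 1.454, 2.024])

def gaussian_mixture(rng, shape):
    """Entries from (1/2) N(0,1) + (1/2) N(0,3); 3 is read as a variance (assumption)."""
    pick_wide = rng.random(shape) < 0.5
    scale = np.where(pick_wide, np.sqrt(3.0), 1.0)
    return scale * rng.standard_normal(shape)

def rescaled_largest_eigenvalue(X, sigma2, a1=0.0, a2=0.0):
    """Center and scale the largest eigenvalue of X X^T / N (assumed Johnstone-type form)."""
    N, p = X.shape
    lam1 = np.linalg.eigvalsh(X @ X.T / N)[-1]   # eigvalsh sorts in ascending order
    root_sum = np.sqrt(N + a1) + np.sqrt(p + a2)
    center = sigma2 * root_sum ** 2
    scale = sigma2 * root_sum * (1 / np.sqrt(N + a1) + 1 / np.sqrt(p + a2)) ** (1 / 3)
    return (N * lam1 - center) / scale

def empirical_cdf_at_percentiles(N, p, replications, seed=0):
    """Estimated cumulative probabilities of the rescaled eigenvalue at the F_1 percentiles."""
    rng = np.random.default_rng(seed)
    sigma2 = 2.0  # variance of the mixture under the assumption above: (1 + 3) / 2
    stats = np.array([
        rescaled_largest_eigenvalue(gaussian_mixture(rng, (N, p)), sigma2)
        for _ in range(replications)
    ])
    return np.array([(stats <= q).mean() for q in TW1_PERCENTILES])

cdf = empirical_cdf_at_percentiles(N=25, p=50, replications=400)
print(np.round(cdf, 3))
```

With only 400 replications the estimates fluctuate by a few percent; increasing `replications` to 10000 matches the sample size used for the tables.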

For small sizes, $N = 10$ and $p = 20$ e.g., the proximity between the
observed cumulative distribution of the rescaled largest eigenvalue and the
Tracy-Widom distribution is reasonable essentially for upper quantiles for
the t-distribution (95%). For the Gaussian mixture, it is also reasonable
for smaller quantiles. As the size of the matrix increases, the proximity
becomes acceptable for almost the whole range. We also note that the
convergence is almost as good as for the Wishart ensemble (compare with
Table 1 in [15]) in both cases.



Table 2: Student's t-distribution, $\gamma_N = 2, 4$

    Pc.  F_1    10 x 20   25 x 50   50 x 100  10 x 40   25 x 100  50 x 200
 -3.896  .01    .0024     .0047     .0069     .0041     .0052     .0078
 -3.516  .025   .0097     .0144     .0188     .0129     .0162     .0220
 -3.180  .05    .0262     .0346     .0392     .0336     .0350     .0469
 -2.782  .10    .0674     .0838     .0857     .0755     .0842     .0924
 -2.088  .25    .2203     .2266     .2375     .2220     .2317     .2399
 -1.269  .50    .4863     .4784     .4950     .4835     .4890     .4922
 -0.392  .75    .7457     .7421     .7487     .7415     .7433     .7493
  0.450  .90    .8908     .8950     .9028     .8951     .8888     .9003
  0.979  .95    .9440     .9463     .9501     .9490     .9421     .9507
  1.454  .975   .9694     .9730     .9730     .9727     .9697     .9735
  2.024  .99    .9872     .9887     .9896     .9878     .9880     .9895

The entries of the random matrices are i.i.d. with a t-distribution with 40
degrees of freedom and $a_1 = a_2 = -0.5$. We have used the Matlab function
trnd to generate the Student random variables.



Figure 1: Probability plot of 10000 replications of $\lambda^{N,p}$ against
$F_1^{-1}((i - 0.5)/10000)$. Left fig.: Student distribution with 40
degrees of freedom, $N = 50$, $p = 200$, $R = 10000$. Right fig.: Gaussian
mixture, $N = 50$, $p = 200$, $R = 10000$. The largest eigenvalue is
rescaled as in (5) with $a_1 = a_2 = 0$ (resp. $a_1 = a_2 = -0.5$) for the
Gaussian mixture (resp. Student's) distribution.
Table 3: Gaussian mixture, $\gamma_N = 50, 100$

    Pc.  F_1    10 x 500   25 x 1250  50 x 2500  10 x 1000  25 x 2500  50 x 5000
 -3.896  .01    .0147      .0114      .0099      .0180      .0159      .0113
 -3.516  .025   .0342      .0276      .0248      .0377      .0322      .0260
 -3.180  .05    .0615      .0553      .0488      .0643      .0590      .0501
 -2.782  .10    .1156      .1051      .1009      .1148      .1069      .0977
 -2.088  .25    .2706      .2473      .2418      .2663      .2552      .2420
 -1.269  .50    .5024      .4906      .4839      .5049      .4971      .4792
 -0.392  .75    .7432      .7375      .7354      .7450      .7411      .7284
  0.450  .90    .8899      .8887      .8906      .8882      .8884      .8893
  0.979  .95    .9435      .9434      .9448      .9393      .9432      .9425
  1.454  .975   .9694      .9716      .9706      .9698      .9698      .9712
  2.024  .99    .9867      .9876      .9875      .9875      .9878      .9883

The entries are i.i.d. with a Gaussian mixture distribution and
$a_1 = a_2 = 0$.







Table 4: Student's t-distribution, $\gamma_N = 50, 100$

    Pc.  F_1    10 x 500   25 x 1250  50 x 2500  10 x 1000  25 x 2500  50 x 5000
 -3.896  .01    .0094      .0093      .0096      .0100      .0099      .0096
 -3.516  .025   .0224      .0246      .0221      .0252      .0221      .0229
 -3.180  .05    .0470      .0467      .0479      .0503      .0466      .0468
 -2.782  .10    .1006      .0898      .0961      .0965      .0936      .0962
 -2.088  .25    .2415      .2356      .2422      .2466      .2412      .2419
 -1.269  .50    .4938      .4819      .4901      .4882      .4893      .4883
 -0.392  .75    .7418      .7409      .7456      .7418      .7400      .7452
  0.450  .90    .8970      .8928      .8984      .8953      .8917      .8979
  0.979  .95    .9468      .9462      .9469      .9495      .9480      .9466
  1.454  .975   .9717      .9747      .9727      .9743      .9733      .9715
  2.024  .99    .9889      .9889      .9885      .9895      .9902      .9887

The entries are i.i.d. with a t-distribution and $a_1 = a_2 = -0.5$.
1.3 Sketch of the proof



We here give the main ideas of the proof of both Theorems 1.3 and 1.5. The
proof follows essentially the strategy introduced in [30] and we refer to
this paper for most of the detail. We focus on the case where
$\gamma = \lim_{N\to\infty} p/N < \infty$. Basically we compute the leading
term in the asymptotic expansion of expectations of traces of high powers
of $M_N$:

\[ E\Big[\mathrm{Tr}\Big(\frac{1}{N}XX^*\Big)^{s_N}\Big]. \tag{6} \]



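Here $s_N$ is a sequence such that there exists some constant $c > 0$ with
$\lim_{N\to\infty} s_N/N^{2/3} = c$. It is indeed expected that the largest
eigenvalues exhibit fluctuations in the scale $N^{-2/3}$ around
$u_+ := \sigma^2(\sqrt{\gamma_N} + 1)^2$. The core of the proof is to show
that for large powers $s_N \sim N^{2/3}$, for any integer $K \ge 1$, and
any real numbers $t_i > 0$, $i = 1, \ldots, K$, chosen in a compact
interval of $(0, \infty)$, there exists $C(K) > 0$ such that

\[ E\Big[\prod_{i=1}^{K}\mathrm{Tr}\Big(\frac{XX^*}{N u_+}\Big)^{[t_i N^{2/3}]}\Big]
\le C(K) \]

and

\[ E\Big[\prod_{i=1}^{K}\mathrm{Tr}\Big(\frac{XX^*}{N u_+}\Big)^{[t_i N^{2/3}]}\Big]
- E\Big[\prod_{i=1}^{K}\mathrm{Tr}\Big(\frac{X_G X_G^*}{N u_+}\Big)^{[t_i N^{2/3}]}\Big]
= o(1). \tag{7} \]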



Formula (7) claims universality of moments of traces of powers of $M_N$ in
the scale $N^{2/3}$. Using the machinery developed in [29] (Sections 2 and
5) and [30] (Section 2), we can then deduce that the limiting joint
distribution of any fixed number of largest eigenvalues for sample
covariance matrices of type (i) to (iv) (resp. (i') to (iv')) is the same
as for complex (resp. real) Wishart ensembles. Here we roughly give the
main idea. On one hand, the Laplace transform of the joint distribution of
a finite number of the rescaled eigenvalues $\mu_i$ can be conveniently
expressed in terms of joint moments of traces as in (6). On the other hand,
the asymptotic distribution of these rescaled largest eigenvalues (and also
the corresponding Laplace transform) is well known in the Wishart setting.
One can then deduce from universality of moments of traces that the
asymptotic joint distribution of the largest eigenvalues for any ensemble
considered here is the same as for the corresponding Wishart ensemble. The
detail of the derivation of such a result from formula (7), including the
required asymptotics of correlation functions for Wishart ensembles, can be
found in [29], [30] and p]. The improvement we obtain with respect to [30]
is that Formula (7) holds for any value of $\gamma$. Our result is due to a
refinement in the counting procedure of [30].

The paper is organized as follows. In Section 2, we introduce the
so-called Narayana numbers. These numbers are the major combinatorial tools
needed to adapt the computations of [30] to sample covariance matrices of
any sample size to dimension ratio $\gamma_N$. We also establish a central
limit theorem for traces of high powers of $M_N$. Section 3 closely follows
the computations made in [30] and essentially yields formula (7). Finally,
in Section 4 we consider the case where $\gamma_N \to \infty$, which
requires some minor modifications.

Acknowledgments. I thank C. Tracy for the Mathematica code of the
Tracy-Widom distribution, and A. Soshnikov, D. Paul, N. Patterson, R. Cont,
L. Choup and C. Semadeni for their great help in the improvements which
led to the final version of this paper. This work was done while visiting
UC Davis.

2 Combinatorics 

In this section, we define the combinatorial objects suitable for the com- 
putation of moments of the spectral measure of random sample covariance 
matrices. These combinatorial objects are the Narayana paths and are di- 
rectly related to the so-called Marchenko-Pastur distribution. Then, we give 
the basic technical estimates needed to compute the moments of traces of 
powers of sample covariance matrices. We illustrate our counting strategy 
by giving a refinement of the Marchenko-Pastur theorem and also obtain a 
Central Limit Theorem. 

2.1 Dyck paths and Narayana numbers 

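Let $s_N$ be some integer that may depend on $N$. Developing (6), we obtain

\[ E[\mathrm{Tr}(XX^*)^{s_N}] =
\sum_{i_0, \ldots, i_{s_N-1}}\ \sum_{j_0, \ldots, j_{s_N-1}}
E\big[X_{i_0 j_0}\overline{X_{i_1 j_0}}X_{i_1 j_1}\overline{X_{i_2 j_1}}
\cdots X_{i_{s_N-1} j_{s_N-1}}\overline{X_{i_0 j_{s_N-1}}}\big], \tag{8} \]

where $i_k \in \{1, 2, \ldots, N\}$ and $j_k \in \{1, \ldots, p\}$,
$0 \le k \le s_N - 1$. (9)

In the whole paper, we denote by R the rule (9) for the choice of indices
in (8). We shall later prove that such a rule plays a fundamental role in
the asymptotics of (8). To each term in the expectation (8), we associate
three combinatorial objects that will be needed in the following.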

First, to each term
$X_{i_0 j_0}\overline{X_{i_1 j_0}} \cdots X_{i_{s_N-1} j_{s_N-1}}\overline{X_{i_0 j_{s_N-1}}}$
occurring in (8), we associate the following "edge path" $\mathcal{P}_E$,
formed with oriented edges (read from bottom to top)

\[ \mathcal{P}_E = \begin{pmatrix}
j_0 & j_0 & j_1 & j_1 & \cdots & j_{s_N-1} & j_{s_N-1} \\
i_0 & i_1 & i_1 & i_2 & \cdots & i_{s_N-1} & i_0
\end{pmatrix}. \tag{10} \]

Due to the symmetry assumption on the entries of $X$, the only paths
leading to a non-zero contribution in (8) are such that each oriented edge
appears an even number of times. From now on, we consider only such even
edge paths.



To such an even edge path, we also associate a so-called Dyck path, which
is a trajectory $x(t)$, $0 \le t \le 2s_N$, of a simple random walk on the
positive half-lattice such that

\[ x(0) = 0,\ x(2s_N) = 0;\quad \forall t \in [0, 2s_N],\ x(t) \ge 0
\ \text{and}\ x(t) - x(t-1) = \pm 1. \]

We start the path at the origin and draw up steps $(1, +1)$ and down steps
$(1, -1)$ as follows. We read successively the $2s_N$ edges of (10),
reading each edge from bottom to top. Then if the (oriented) edge is read
for an odd number of times, we draw an up step. Otherwise we draw a down
step. We obtain in this way a trajectory with $s_N$ up and $s_N$ down
steps, which is clearly a Dyck path. We shall now estimate the number of
possible trajectories associated to the edge path. Due to the constraint on
the choices for vertices, we shall distinguish trajectories with respect to
the number of up steps performed at an odd instant. Indeed, they are the
moments of time where the vertices can be chosen in the set
$\{1, \ldots, p\}$.
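
The construction of the trajectory can be illustrated with a short Python sketch. The function names and the toy index sequences are hypothetical; the code only assumes that the edges are read in the order $(i_0, j_0), (i_1, j_0), (i_1, j_1), \ldots, (i_0, j_{s_N-1})$, each as a (bottom vertex, top vertex) pair.

```python
from collections import Counter

def edge_sequence(i_idx, j_idx):
    """List the 2*s oriented edges (bottom vertex, top vertex) of the edge path."""
    s = len(i_idx)
    edges = []
    for k in range(s):
        edges.append((i_idx[k], j_idx[k]))            # edge read at an odd instant
        edges.append((i_idx[(k + 1) % s], j_idx[k]))  # edge read at an even instant
    return edges

def dyck_steps(i_idx, j_idx):
    """Up step (+1) when an edge is read for the odd-th time, down step (-1) otherwise."""
    seen = Counter()
    steps = []
    for edge in edge_sequence(i_idx, j_idx):
        seen[edge] += 1
        steps.append(+1 if seen[edge] % 2 == 1 else -1)
    if any(count % 2 for count in seen.values()):
        raise ValueError("not an even edge path")
    return steps

def odd_up_steps(steps):
    """Number k of up steps at odd instants; list index t is instant t + 1."""
    return sum(1 for t, step in enumerate(steps) if step == +1 and t % 2 == 0)

steps = dyck_steps(i_idx=[1, 2], j_idx=[3, 3])
print(steps, odd_up_steps(steps))
```

In this toy example the edges are $(1,3), (2,3), (2,3), (1,3)$, so the trajectory goes up twice and then down twice, with a single up step at an odd instant.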

In the whole paper, we denote by $k$ the number of up steps performed at an
odd instant in a Dyck path. In particular, we denote by $X_k$ the
trajectory associated to $\mathcal{P}_E$, where $k$ is the number of its
odd up steps. We also call $\mathcal{X}_{s_N,k}$ the set of Dyck paths of
length $2s_N$ with $k$ odd up steps.

Proposition 2.1. ([5]) Let $\mathbf{N}(s_N, k)$ be the so-called $k$th
Narayana number defined by

\[ \mathbf{N}(s_N, k) = \frac{1}{s_N}C_{s_N}^{k}C_{s_N}^{k-1}. \tag{11} \]

Then $\mathbf{N}(s_N, k) = \#\mathcal{X}_{s_N,k}$.
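
For small lengths, the statement of Proposition 2.1 can be checked by brute force. The Python sketch below is an illustration only (the function names and the choice $s = 5$ are assumptions): it enumerates all $\pm 1$ sequences of length $2s$, keeps the Dyck paths, and groups them by their number of up steps at odd instants.

```python
from itertools import product
from math import comb

def narayana(s, k):
    """k-th Narayana number (1/s) * C(s, k) * C(s, k - 1)."""
    return comb(s, k) * comb(s, k - 1) // s

def count_by_odd_up_steps(s):
    """counts[k] = number of Dyck paths of length 2s with k up steps at odd instants."""
    counts = [0] * (s + 1)
    for steps in product((+1, -1), repeat=2 * s):
        height, valid = 0, True
        for step in steps:
            height += step
            if height < 0:        # the path must stay on the positive half-lattice
                valid = False
                break
        if valid and height == 0:
            # list index t corresponds to instant t + 1, so even indices are odd instants
            k = sum(1 for t, step in enumerate(steps) if step == +1 and t % 2 == 0)
            counts[k] += 1
    return counts

s = 5
counts = count_by_odd_up_steps(s)
print(counts[1:], [narayana(s, k) for k in range(1, s + 1)])
```

The enumeration is exponential in $s$ and is meant only as a check for small values; summing the counts over $k$ recovers the Catalan number of Dyck paths of length $2s$.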

Remark 2.1. For more details about Narayana numbers and their occurrences
in various combinatorial problems, we refer the reader to the work of
Sulanke [32], [33] as well as Stanley [31].

Narayana numbers are intimately linked with Dyck paths. Let
$D(2s_N) = \frac{1}{s_N + 1}C_{2s_N}^{s_N}$ be the Catalan number counting
the number of Dyck paths of length $2s_N$. It is obvious that
$\sum_{k=1}^{s_N}\mathbf{N}(s_N, k) = D(2s_N)$. Narayana numbers are also
linked to the moments of the Marchenko-Pastur distribution defined in (1),
since the following was proved by Jonsson [17] (see also [23] and [1]).



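Proposition 2.2. For any integer $L$, one has that

\[ \lim_{N\to\infty}\frac{1}{N}E\big[\mathrm{Tr}\, M_N^L\big]
= \sum_{k=1}^{L}\sigma^{2L}\gamma^{k}\,\mathbf{N}(L, k)
= \int x^L\, d\rho_{MP}(x). \tag{12} \]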



Remark 2.2. Proposition 2.2 was actually proved for a broader class of
sample covariance matrices than that considered in this paper.
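
The second equality in (12) can be checked numerically for small $L$. The Python sketch below is illustrative only: the function names, the values $\gamma = 2$, $\sigma^2 = 1$, and the use of the density $\sqrt{(u_+ - x)(x - u_-)}/(2\pi x\sigma^2)$ on $[u_-, u_+]$ with $\gamma > 1$ are assumptions made for the example.

```python
import numpy as np
from math import comb

def narayana(L, k):
    """k-th Narayana number (1/L) * C(L, k) * C(L, k - 1)."""
    return comb(L, k) * comb(L, k - 1) // L

def mp_moment_combinatorial(L, gamma, sigma2=1.0):
    """Sum over k of sigma^(2L) * gamma^k * N(L, k)."""
    return sigma2 ** L * sum(gamma ** k * narayana(L, k) for k in range(1, L + 1))

def mp_moment_numerical(L, gamma, sigma2=1.0, points=400001):
    """Trapezoidal approximation of the L-th moment of the density (gamma > 1 assumed)."""
    u_minus = sigma2 * (1 - np.sqrt(gamma)) ** 2
    u_plus = sigma2 * (1 + np.sqrt(gamma)) ** 2
    x = np.linspace(u_minus, u_plus, points)
    under_root = np.maximum((u_plus - x) * (x - u_minus), 0.0)  # guard against rounding
    y = x ** L * np.sqrt(under_root) / (2 * np.pi * x * sigma2)
    return float(np.sum((y[1:] + y[:-1]) * np.diff(x)) / 2)

gamma = 2.0
for L in range(1, 5):
    print(L, mp_moment_combinatorial(L, gamma), mp_moment_numerical(L, gamma))
```

For instance, with $\gamma = 2$ and $\sigma^2 = 1$ the third moment given by the Narayana sum is $2 \cdot 1 + 4 \cdot 3 + 8 \cdot 1 = 22$.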



Last, we associate to the edge path $\mathcal{P}_E$ a "usual" path, which
we denote by $P_k$, as follows. We mark on the underlying trajectory $x$
the successive vertices met in the edge path. The path $P_k$ associated to
(10) is then as in Figure 2.

Figure 2: The path $P_k$ with $k = 4$, $s_N = 8$.



The three structures $\mathcal{P}_E$, $X_k$ and $P_k$ we have here
introduced will now be used to compute the moments of traces of (high)
powers of $M_N$. Our counting strategy is as follows. Given the underlying
Dyck path $X_k$, we shall estimate the number of edge paths that can be
associated to this Dyck path. We shall also estimate their contribution to
the expectation (8). This is the object of the two next subsections.

2.2 Marked vertices

In this subsection, we bring out the connection between Narayana paths and
the restriction for the choices of vertices occurring in the path imposed
by the rule R. Given a trajectory
$\{x(t), t \le 2s_N\} \in \mathcal{X}_{s_N,k}$, we shall now count the
number of ways to mark the vertices using the rule R. In this way, we count
the number of paths $P_k$ associated to a given trajectory. The terminology
we use is close to the one used in [29], [27], [28] and [30]. We recall the
main definitions that will be needed here and also assume that the reader
is acquainted with most of the techniques used in the above papers.

The first task is to choose the pairwise distinct vertices occurring in the
path. There are at most p^^^^ such vertices. We shall now define "marked 
vertices", separating the cases where they are marked at an odd or even 
instant. 

Definition 2.1. An instant is said to be marked if it is the right endpoint 
of an up edge of the trajectory x. 

Marked instants correspond to the moments of time where, considering 
the top and bottom lines separately, one can possibly "discover" some vertex 
not already encountered. Consider first the vertices on the top line of the 
edge path Ve-, that is, vertices occuring at odd instants in the path P^. For 
< i < Sat, call % the class of vertices of {1, . . . ,p} occuring i times as a 
marked vertex at an odd instant. Then if we set pi = '^Ti, one has 



Note that each time we "discover" on the top line some new vertex, the 
corresponding instant is necessarily marked. Consider also the vertices on 
the bottom line. For < i < sn, denote by A/i the class of vertices of 
{1, . . . , A'^} occuring i times as a marked vertex at an even instant. Then 
one has, if nj = jJA/i, 



Note that a vertex from the set $\{1, 2, \dots, N\}$ can occur as a marked vertex
on both lines. Yet it is the type of the vertex on each line which is here
taken into account. Thanks to the above definition, we characterize a path
$P_k$ by its type

$$(n_0, n_1, \dots, n_{s_N})\,(p_0, p_1, \dots, p_{s_N}), \quad \text{with } n_i = 0 \ \forall i > s_N - k, \ p_i = 0 \ \forall i > k.$$

For short, we denote by $(\tilde n, \tilde p)$ the type of such a path. We also use the
following notation. Any vertex $v \in \cup_{i \ge 2} \mathcal{T}_i$ (resp. $v \in \cup_{i \ge 2} \mathcal{N}_i$) is said to be
a vertex of self-intersection on the top (resp. bottom) line. A vertex $v \in \mathcal{T}_i$
(resp. $v \in \mathcal{N}_i$) is said to be of type $i$ on the top (resp. bottom) line.



The choice of marked vertices is enough to determine the distinct vertices
of the path $P_k$, if the origin of the path also occurs as a marked vertex. We
will see that for typical paths, this is not the case and $i_0 \in \mathcal{N}_0$. Thus, given
the type $(\tilde n, \tilde p)$ of the path, the number of ways to assign vertices at the
marked instants and choose the origin is then at most:



Indeed, one distributes the vertices of $\{1, \dots, N\}$ and $\{1, \dots, p\}$ into the
possible classes $\mathcal{N}_i, \mathcal{T}_i$, $1 \le i \le s_N$, chooses the corresponding marked
occurrences of each vertex and fixes the origin. Once the marked vertices and
the origin of the path are chosen, there remains to fill in the blanks of the
path $P_k$. Due to self-intersections, there are multiple ways to do so. We
investigate this numbering in the sequel and consider at the same time the
expectation of the filled path.
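
The displayed bound is not legible in this copy, so the following is only a sketch of the count that the verbal description suggests: distribute the vertices into classes (two multinomial coefficients), distribute the marked instants of each line among the vertices according to their types (two more multinomial coefficients), and pick the origin (a factor N). The function name `marked_vertex_assignments` and the exact form of the product are assumptions made for illustration, not the formula of the text.

```python
from math import factorial

def multinomial(total, parts):
    """Number of ways to split `total` labelled items into groups of the given sizes."""
    assert sum(parts) == total
    out = factorial(total)
    for size in parts:
        out //= factorial(size)
    return out

def marked_vertex_assignments(N, p, n, pt):
    """Assumed bound on (assignments of vertices at marked instants) x (choice of origin).

    n[i]  = number of vertices of {1..N} of type i on the bottom line,
    pt[i] = number of vertices of {1..p} of type i on the top line.
    """
    assert sum(n) == N and sum(pt) == p
    even_marked = sum(i * m for i, m in enumerate(n))   # plays the role of s_N - k
    odd_marked = sum(i * m for i, m in enumerate(pt))   # plays the role of k
    # Step 1: distribute the vertices into the classes of each line.
    classes = multinomial(N, n) * multinomial(p, pt)
    # Step 2: give i marked instants to every vertex of type i, line by line.
    occ_bottom = multinomial(even_marked, [i for i, m in enumerate(n) for _ in range(m)])
    occ_top = multinomial(odd_marked, [i for i, m in enumerate(pt) for _ in range(m)])
    # Step 3: choose the origin among the N bottom-line vertices.
    return N * classes * occ_bottom * occ_top
```

With N = p = 3, one unused and two type-1 vertices on the bottom line, and two unused and one type-1 vertex on the top line, the sketch returns 3 x 9 x 2 x 1 = 54.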

2.3 Filling in the blanks of the path 

Consider now a path with k odd marked instants and of type {n,p). Call 
^max the maximum number of ways to fill in the blanks of P^ at the un- 
marked instants, once the marked instants and the origin are given. 

Proposition 2.3. Set ^max^max ■■= maX(jj^p) n^ax E 11^=0'^ ^ijij+i 

There exists $C > 0$ independent of $p$, $N$, $k$ and $s_N$ such that

9«,r SN — k k 
1=2 m=2 
Proof of Proposition 2.3. We only sketch the proof, which follows essentially
the same steps as that of Lemma 1 in [28]. Assume that in $P_k$, at the
unmarked instant $t$, one makes a down step with left vertex $i$. If $i$ is of type
1, then there is no choice for the right endpoint of such an edge. In general,
the maximal number of possible right endpoints depends on the multiplicity
of $i$ as a marked vertex. Knowing the parity of $t$, one also knows whether
$i$ is a vertex on the top or on the bottom line of $\mathcal{P}_e$. Thus only the top or
bottom multiplicity of the vertex has to be taken into account to estimate
the number of ways to close the edges. Thus, it is not hard to see that the
number of ways to close the path is at most



$$2 \prod_{l=2}^{s_N - k} (2l)^{l n_l} \prod_{m=2}^{k} (2m)^{m p_m}.$$



The extra factor 2 comes from the (negligible) case where the origin is of
type 1. To consider simultaneously the expectation of the path, one also
has to take into account the number of times each oriented edge is read.

Assume that an edge $(vw)$ is read $2q$ times, with $q \ge 2$. Call $l_u(v;w)$
(resp. $l_d(w;v)$) the number of times $v$ (resp. $w$) is a marked vertex of
this edge. Then, if $l_d(w;v)\, l_u(v;w) > 0$, one has that
$\mathbb{E}|X_{vw}|^{2q} \le (\tau q)^q \le (2\tau l_u(v;w))^{l_u(v;w)} (2\tau l_d(w;v))^{l_d(w;v)}$.
Now if an oriented edge is read $2l$ times
then it is closed $l$ times along the same edge. That is, we overcount the
number of ways to close the path. Thus, if we let $2l(ij)$ be the number of
times the oriented edge $(ij)$ is read in the path, we obtain that



£ n ^^ff^'iipo-ni^™) 



mpm 



l{ij)>l ^•'^ 1=2 

sjv — fc ^ k 

^ n (^0 " n 

1=2 m=2 

where $C_\tau, C$ are some constants independent of $p$, $k$, $N$ and $s_N$. □
Remark 2.3. In the case where the entries $X_{ij}$ have polynomial tails, with
$\mathbb{P}(|X_{ij}| > x) \le (1 + x)^{-m_0}$ for some $m_0 > 36$, one can first consider (up
to a set of negligible probability) that all the entries of $X$ are smaller in
absolute value than $\Gamma_N := N^{2/m_0 + \epsilon}$ for some $\epsilon > 0$ small enough. This is
true if $\gamma < \infty$. Then Proposition 2.3 has to be replaced with

O V S[^jk A; mp„i 



1=2 m=2 
This follows from the fact that to each edge seen $2l > 2 m_0$ times there
correspond at least $l/2$ marked occurrences of one of its endpoints.

In the next two subsections, we investigate moments of traces of
powers of $M_N$ in scales $s_N \ll \sqrt{N}$. This will give the foundations for the
asymptotics of higher moments.

2.4 Narayana numbers and the Marchenko-Pastur distribu- 
tion 

In this subsection, we illustrate our counting strategy and present a refinement
of (12), which allows one to consider higher moments than in Proposition



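Proposition 2.4. If $s_N \ll \sqrt{N}$, one has that

$$\frac{1}{N}\, \mathbb{E}\left[ \mathrm{Tr} M_N^{s_N} \right] = \sum_{k=1}^{s_N} \gamma_N^k\, \sigma^{2 s_N}\, \mathbf{N}(s_N, k)\, (1 + o(1)).$$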



Proof of Proposition 2.4. The proof is similar to that of the classical
Wigner theorem using Dyck paths (see e.g. pL]). It is divided into two steps. 
First, we show that paths for which $\sum_{i \ge 2} n_i + p_i > 0$ yield a negligible
contribution to $\mathbb{E}[\mathrm{Tr} M_N^{s_N}]$. Then we estimate the contribution of paths with
vertices of type 1 at most, which give the leading term of the asymptotic
expansion of $\mathbb{E}[\mathrm{Tr} M_N^{s_N}]$, as long as $s_N \ll \sqrt{N}$.

Denote by $Z(k, (\tilde n, \tilde p))$ the contribution of paths with $k$ odd marked
instants and of type $(\tilde n, \tilde p)$. Using Proposition 2.3 and (13), we deduce that

Z{k, {n,p)) 



iV^Jv p^\no\pi\ ni\ ^_}-pi\ni\{i\)P'{i\)'^' 



1=2 ■ 



1=2 



where in the last line we have used that $\gamma_N = p/N$ and $C > 0$
is a constant independent of $N$, $p$, $k$ and $s_N$ (whose value may change from
line to line).
We denote by $Z_2$ the contribution of paths for which $\sum_{i \ge 2} n_i + p_i > 0$.
By Proposition 2.3 and using summation, one has that



fc=l Mi,M2:Afi+Af2>0 

where Mi = X]j>2 -^2 = Si>2K- Thus, it is straightforward to see 
that there exists some constant $B > 0$ independent of $N$ such that

f <i?f xa2-^7^N(.^,fc). (16) 
k=l 

From this, we can deduce that $Z_2/N = o(\sigma^{2 s_N} (1 + \sqrt{\gamma_N})^{2 s_N})$.

We now show that only paths with vertices of type 1 (except the origin
which is unmarked) have to be taken into account. In this case, once the
vertices occurring in the path have been chosen, there is no choice for filling
in the blanks of the path $P_k$. Furthermore, each edge is passed only twice
in the path $\mathcal{P}_e$, once at an odd instant and once at an even instant. Thus,
denoting by $Z_1 := \sum_{k=1}^{s_N} Z_1(k)$ the contribution of such paths, one has that

Zi = ^ iVN(.;v, khW- n ^ = (1 + '^(l)) E ^N(.7V, k)jl,a'^-. 

k=l i=l k=l 

Using (10), this finishes the proof that $Z_2 = o(1) Z_1$. The contribution of
paths with marked origin and vertices of type 1 at most is of order $Z_1 s_N/N$
and thus negligible. This finishes the proof of Proposition 2.4. □
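
The counting behind this subsection can be checked by brute force for small lengths. The sketch below assumes that the bold N(s, k) of the text is the classical Narayana number (1/s) C(s, k) C(s, k-1), enumerates all Dyck paths of length 2s, and tallies them by their number of up steps at odd instants. It also evaluates the sum of N(s, k) γ^k, which under this assumption is the s-th moment of the Marchenko-Pastur law with ratio γ and unit variance. All function names are illustrative.

```python
from itertools import product
from math import comb

def narayana(s, k):
    """Classical Narayana number (1/s) * C(s, k) * C(s, k - 1)."""
    return comb(s, k) * comb(s, k - 1) // s

def odd_up_step_counts(s):
    """counts[k] = number of Dyck paths of length 2s with k up steps at odd instants."""
    counts = [0] * (s + 1)
    for steps in product((1, -1), repeat=2 * s):
        level, valid = 0, True
        for step in steps:
            level += step
            if level < 0:          # a Dyck path never goes below level 0
                valid = False
                break
        if valid and level == 0:
            # instants are 1-based: steps[0] is read at instant 1
            k = sum(1 for t, step in enumerate(steps, start=1)
                    if step == 1 and t % 2 == 1)
            counts[k] += 1
    return counts

def mp_moment(s, gamma):
    """Sum over k of N(s, k) * gamma**k (sigma = 1)."""
    return sum(narayana(s, k) * gamma ** k for k in range(1, s + 1))
```

For γ = 1 the sum reduces to the Catalan numbers, and for s = 2, γ = 2 it gives γ + γ² = 6.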

Remark 2.4. Set $\hat k = \left[ \frac{\sqrt{\gamma_N}}{1 + \sqrt{\gamma_N}}\, s_N \right] + 1$. For any sequence $s_N \gg 1$, one has that
$\sum_{1 \le k \le s_N} \gamma_N^k\, \mathbf{N}(s_N, k)\, \sigma^{2 s_N} = O(u_+^{s_N})$ and that the main contribution to
the expectation (6) should come from paths with $\hat k (1 + o(1))$ odd marked
instants. Indeed, using Stirling's formula, one has



2 



(1 + VW)' 1 



rnax (c^) i'n - In - + Vl^f' ^ ^ ^- 

l<k<SM V / \ "V Sn^Jn ZTT 

It is also easy to check that, for any $l > 0$, one has that

msN, k + Iht' < ^i^N, k)4 exp {- }, (17) 

(SN - k) 
for some constant depending on $\gamma$ only. In the case where $l < 0$, we fix
some $A > 0$ large. Then one can show that, for any $-A(s_N - k + 1) <$

I < 0, N(^iV.fc-l+07^''+' < r_ Q ^ ^ f 

' N(sAr,fc-l)7^-i - (2A+2){sjv-fc+l)-' ^ 

I < -A(sN -k + 1), 1+^)7^-^+' < g-A{s^-fc)/3_ This ensures that 

Ei<fe<s, T^C^^C^-'^'^a^'^ = 0(nf ), yielding RemarkEl 
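
The location of the dominant number of odd marked instants can be checked numerically. The sketch below assumes that N(s, k) is the classical Narayana number (1/s) C(s, k) C(s, k-1) and that the dominant index is the maximizer of N(s, k) γ^k; it compares that maximizer with the value [√γ s / (1 + √γ)] + 1, which follows from setting the ratio of consecutive terms equal to one. The function names are illustrative.

```python
from math import comb, floor, log, sqrt

def log_weight(s, k, gamma):
    """log of N(s, k) * gamma**k with N(s, k) = C(s, k) * C(s, k - 1) / s."""
    return log(comb(s, k)) + log(comb(s, k - 1)) - log(s) + k * log(gamma)

def dominant_k(s, gamma):
    """Index k in 1..s maximizing N(s, k) * gamma**k."""
    return max(range(1, s + 1), key=lambda k: log_weight(s, k, gamma))

def predicted_k(s, gamma):
    """[sqrt(gamma) / (1 + sqrt(gamma)) * s] + 1."""
    return floor(sqrt(gamma) / (1 + sqrt(gamma)) * s) + 1
```

For instance, with s = 200 and γ = 2 the ratio of consecutive terms, γ (s-k)(s-k+1) / (k(k+1)), crosses one between k = 117 and k = 118, and both functions return 118.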

Remark 2.5. In the case of polynomial tails, (15) is multiplied by a factor



n^>m./4r^^"'^'''^- If e < and sn = OiN^^), (which is the largest 

scale considered in this paper), this has no impact on computations as 

j^i^i << 1 for any i > mo/4:. All the results stated in the following 
can be proved in the case of polynomial tails up to minor technical modifications
(which amount essentially to treating separately vertices of type at
least $m_0/4$).

2.5 A Central Limit Theorem 

The main result of this subsection is the following Proposition. Set
$u_+ = \sigma^2 (1 + \sqrt{\gamma_N})^2$. We show that all the moments of $\mathrm{Tr}(M_N/u_+)^{s_N}$ are bounded
and universal, as long as $1 \ll s_N \ll \sqrt{N}$. Assume that
$\lim_{N \to \infty} \gamma_N = \gamma < \infty$ and set $I_\beta = 1/(\beta\pi)$ where $\beta = 1$ (resp. $\beta = 2$) in the case where
$M_N$ is real (resp. complex).

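Proposition 2.5. Assume that $1 \ll s_N \ll \sqrt{N}$ and set $\tilde M_N = M_N/u_+$.
Then, there exists $D > 0$ such that $\mathrm{Var}\left( \mathrm{Tr} \tilde M_N^{s_N} \right) \le D$, for any $N$, and
$\lim_{N \to \infty} \mathrm{Var}\left( \mathrm{Tr} \tilde M_N^{s_N} \right) = I_\beta$. Similarly, for any integer $k$,

$$\mathbb{E}\left[ \mathrm{Tr} \tilde M_N^{s_N} - \mathbb{E}[\mathrm{Tr} \tilde M_N^{s_N}] \right]^{2k} = (2k - 1)!!\, I_\beta^k\, (1 + o(1)),$$

$$\mathbb{E}\left[ \mathrm{Tr} \tilde M_N^{s_N} - \mathbb{E}[\mathrm{Tr} \tilde M_N^{s_N}] \right]^{2k+1} = o(1).$$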



Remark 2.6. In [17], a Central Limit Theorem (CLT) is also established for
traces of fixed (independent of $N$) moments of $M_N$. In this case, the limiting
Gaussian distribution does depend on the fourth moment of the law of the
entries. The above CLT is also stated but not proved in Remark 6 of [27]
(a factor is missing) in the case where $\gamma = 1$.

Proof of Proposition 2.5. We only give the proof for the variance. The
proof of the asymptotics for higher moments is a rewriting of pages 128-129
in [28] (also [9] Section 6) and is skipped. In the following, $C_1, \dots, C_6, C'_2, C'_3$
denote some positive constants independent of $N$. One has that



Here, given an edge $e = (v_1, v_2)$, $X_e$ stands for $X_{v_1 v_2}$ if $e$ occurs at an odd
instant of $\mathcal{P}_e$ or for $\overline{X_{v_2 v_1}}$ if it occurs at an even instant. Now it is clear
that the nonzero terms in the above sum come from pairs of paths $\mathcal{P}_e$,
$\mathcal{P}'_e$ sharing at least one oriented edge and such that each edge appears an
even number of times in the union of the two paths. We say that such
paths are correlated. To estimate the number of correlated paths and their
contribution to the variance, we use the construction procedure defined in
Section 3 of [27]. This construction associates a path of length $4 s_N - 2$ to a
pair of correlated paths.

Let $\mathcal{P}_1$ and $\mathcal{P}_2$ be two correlated paths of length $2 s_N$. When reading the
edges of $\mathcal{P}_1$, let $e$ denote the first oriented edge common to the two paths.
Let also $t_e$ and $t'_e$ be the instants of the first occurrence of this edge in $\mathcal{P}_1$
and $\mathcal{P}_2$. Then we are going to glue the two paths $\mathcal{P}_1$ and $\mathcal{P}_2$, in such a way
that we erase the first occurrence of $e$ in each of these paths. The glued
path, denoted $\mathcal{P}_1 \vee \mathcal{P}_2$, is obtained as follows. We first read $\mathcal{P}_1$ until we meet
the left endpoint of $e$ at the instant $t_e$. Then we switch to $\mathcal{P}_2$ as follows.
Assume first that $t_e$ and $t'_e$ are of the same parity. We then read the path
$\mathcal{P}_2$, starting from $t'_e$, in the reverse direction to the origin and restart from
the end of $\mathcal{P}_2$ until we come back to the instant $t'_e + 1$. If $t_e$ and $t'_e$ are not
of the same parity, we read the edges of $\mathcal{P}_2$ in the usual direction starting
from $t'_e + 1$ and until we come back to the instant $t'_e$. We have then read
all the edges of $\mathcal{P}_2$ except the edge $e$ occurring between $t'_e$ and $t'_e + 1$. We then
read the end of $\mathcal{P}_1$, starting from $t_e + 1$. Having done so, we obtain a path
$\mathcal{P}_1 \vee \mathcal{P}_2$ which is of length $4 s_N - 2$. One can also note that the trajectory of
$\mathcal{P}_1 \vee \mathcal{P}_2$ does not descend lower than the level $x(t_e)$ during the time interval
$[t_e, t_e + 2 s_N - 1]$, by the definition of $e$ and $t_e$.

Now, to reconstruct the paths $\mathcal{P}_1$ and $\mathcal{P}_2$ from $\mathcal{P}_1 \vee \mathcal{P}_2$, it is enough to
determine the instant at which one has switched from one path to the other,
the origin of the path $\mathcal{P}_2$ and the direction in which $\mathcal{P}_2$ is read. There are
at most $4 s_N$ ways to determine the origin and the direction once the instant
of switch is known. To estimate the number of preimages of a given path
$\mathcal{P}_1 \vee \mathcal{P}_2$ of length $4 s_N - 2$ and with $k$ odd up steps, one has to give an upper
bound for the number of instants $t_e$ in $\mathcal{P}_1 \vee \mathcal{P}_2$ which can be the instants of
switch. To this aim, fix some $t_e \in [0, 2 s_N - 1]$ and assume that the trajectory
of $\mathcal{P}_1 \vee \mathcal{P}_2$ does not go below the level $x(t_e)$ during an interval of time of
length greater than or equal to $2 s_N - 1$. Assume that $x(t_e) > 0$. Set then

$$l = \inf\{t > t_e,\ x(t) = x(t_e),\ x(t + 1) = x(t_e) - 1\} - 2 s_N + 1.$$

Denote by $T_2$ the sub-trajectory in the interval $[t_e, t_e + 2 s_N - 1 + l]$. It is
a Dyck path. Denote also by $T_1$ the remaining part of the trajectory: it is
also a Dyck path, along which the instant $t_e$ has been chosen. We denote
by $k_1$ the number of odd up steps of $T_1$. As the trajectory of $\mathcal{P}_1 \vee \mathcal{P}_2$
is obtained by inserting $T_2$ at the instant $t_e$ in $T_1$, and using the fact that
$\mathcal{P}_1 \vee \mathcal{P}_2$ and $\mathcal{P}_1 \cup \mathcal{P}_2$ have all the same edges but one, one can then deduce
(see [27], p. 11-13, for the details) that the contribution of correlated pairs
is at most of order

2^-^12.^1 JSijsN - ^-^^,k,)N{sN + '-^,k - ki) 

^ ^ ^ N(2s7v-l,/c) 

k=l 1=0 ki<kA2sN-l-l 

x{2sN - 1 - l)^^Zi{4sN - 2, k)+ 

(18) 

''^^-^ N(s,v - ki)NisN + ^, 57V + ^ + fci - fc) 



EE E 

fc=l 1=0 ki<kA2sN-l-l 

xi2sN - 1 - l)^^Zi{4sN -2,k). 



N{2sN - l,k) 



(19) 



Here $Z_1(4 s_N - 2, k)$ is the contribution of paths of length $4 s_N - 2$ with $k$ odd
up steps to the expectation $\mathbb{E}[\mathrm{Tr} M_N^{2 s_N - 1}]$ and (18) (resp. (19)) corresponds
to the case where $t_e$ is even (resp. odd). The term $(2 s_N - l - 1)$ in (18)
comes from the determination of $t_e$, and $\sigma^2/N = \mathbb{E}(|X_e|^2)/N$, where $e$ is
the edge erased from $\mathcal{P}_1 \vee \mathcal{P}_2$. It can indeed be shown that paths for which
such an edge occurs also in $\mathcal{P}_1 \vee \mathcal{P}_2$ yield a contribution of order $s_N/N$ times that
of typical paths and are thus negligible.

We first show that the variance is bounded. In the following, we set si{l) = 
Sat — ^ and S2{1) = S7V+^- Considering for instance ([H]), ( ([19]) is similar), 
it is enough to prove that there exists a constant $C_1 > 0$ such that

1=0 ki<kA2sM-l~l 

One can easily see that it is enough to consider the case where 2s]\[ — 1 — I > 
y'J/v- It is also straightforward by Remark 12.41 and Proposition 12.41 to see 



that one can choose < 2/3' < < 2/3" < 2 such that 

Zi{4sN - 2, k) « s^^iVu^'^"\ 

k<f3'sN or k>f3"sN 

This is enough to ensure that the contribution of correlated pairs such that
the corresponding glued path has $k$ odd up instants for some $k < \beta' s_N$ or
$k > \beta'' s_N$ is negligible in the large-$N$ limit. We now set

Then, / and k being fixed, / is maximal at ki = [fc ^^^^j';^^ ](+!). Fur- 
thermore, one can check that there exist constants C2,C2 > such that 
f{ki + j) < exp {— C2j^/A;i} for any j. From this we deduce that 

^ N(2.^-l,fc) (2-^-1-0 

ki<kA2sN-l~l 

< C,{2s^ - 1 - ^ N(2.^-l,fe) • 

It is now an easy consequence of Stirling's formula that 

2sAr-l 



<r rcy. 1 n3/2 N(^i(0,fei)N(g2(/),fc-A:i) 



(22) 



where = k/{2sj\f — 1). Using Proposition 12.41 one can also show that 
there exists $C_5 > 0$ such that

V -—^ -Zi{AsN - 2,fc) ~ Cs-J'/^Nul'^-K 

(3'sM<k<f3"sM ^ ' 

Combining the whole yields that there exists a constant $C_6$ such that (18) + (19) $\le C_6$.
In the case where $x(t_e) = 0$, $t_e$ is chosen amongst the returns to the level 0 of
the trajectory. It can be shown that the number of such instants is negligible
with respect to $\sqrt{s_N}$ in typical paths. This follows from arguments already
used above and in [27] p. 13.

To compute the variance, we notice that in (18), the term $(2 s_N - 1 - l)$ can
actually be replaced with $s_1(l) - k_1$. Indeed, as $t_e$ is even, the first step after
$t_e + 2 s_N - 1 + l$ is a down step occurring at an odd instant. Also, there are
only $s_N$ choices for the origin of $\mathcal{P}_2$, since one knows the parity of $e$ in $\mathcal{P}_2$
once the orientation of $\mathcal{P}_2$ is fixed. Then, using (18), (19), Remark 2.4, the
exponential decay of $f(k_1)$, and Proposition 2.4, one can deduce that (for
the real case)

lim Var TrMff 

lim2s^l + — y 



^ E E 7^N(si(/), A;i)N(52(0, k - h) (23) 

k>l ki<k 

1 ^ 1 Sa (I) 

= lim 2sAr(l + — > EsUi)Es,Ai) 24 

In (IMI), we have set Ek = J V((l+v^)'-^K^-(^-^/^)! ) ^nd the equality 
follows from the fact that (25) is a Cauchy product. The value of $I_1$ can be
deduced from Formulas 4.7 in [30] and 3.6 in [27]. The computation of $I_2$
follows from the fact that, in the complex case, the occurrences of $e$ in $\mathcal{P}_1$ and
$\mathcal{P}_2$ cannot have the same parity (in typical paths). □



3 The case where $\gamma_N \to \gamma$, $1 \le \gamma < \infty$

The aim of this section is to prove the following universality results. Let
$M_N = \frac{1}{N} X X^*$ be a sequence of sample covariance matrices satisfying (i)
to (iv) (resp. (i') to (iv')). Let $K$ be some given integer and $c_1, \dots, c_K$
be constants chosen in some compact interval $J \subset \mathbb{R}^+$ (independent of $N$).
Consider sequences $s_N^{(i)}$, $i = 1, \dots, K$, such that $\lim_{N \to \infty} s_N^{(i)}/N^{2/3} = c_i$.
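Theorem 3.1. Set $u_+ = \sigma^2 (1 + \sqrt{\gamma_N})^2$. There exists a constant
$C_1 = C_1(K, J) > 0$ such that

$$\mathbb{E}\left[ \prod_{i=1}^{K} \mathrm{Tr}\left( \frac{X X^*}{N u_+} \right)^{s_N^{(i)}} \right] \le C_1 \quad \text{and} \quad \mathbb{E}\left[ \prod_{i=1}^{K} \mathrm{Tr}\left( \frac{X X^*}{N u_+} \right)^{s_N^{(i)}} \right] = \mathbb{E}\left[ \prod_{i=1}^{K} \mathrm{Tr}\left( \frac{X_G X_G^*}{N u_+} \right)^{s_N^{(i)}} \right] (1 + o(1)).$$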



The proof of Theorem 3.1 is the object of this section. We actually focus
on the case where $K = 1$. Indeed, the proof of Theorem 3.1 for $K > 1$
is a rewriting of the arguments used in [29] (p. 41), Subsection 2.5 and of
the arguments used in the case where $K = 1$. It is not developed further
here. Then, we essentially show that typical paths (i.e. those having a
non-negligible contribution to the expectation) have no oriented edge read
more than twice. This ensures that the expectation (6) only depends on the
variance of the entries $X_{ij}$, $i = 1, \dots, N$, $j = 1, \dots, p$. Universality of the
expectation then follows.



3.1 Number of self-intersections and odd marked instants in 
typical paths 

We first give a technical Proposition which bounds the number of self-intersections
and gives the approximate number of odd marked instants in typical
paths. In the following, we denote by $Z(k)$ the contribution of paths with
$k$ odd marked instants. We also denote the number of self-intersections on
each line as $M_1 = \sum_{i \ge 2} (i - 1) n_i$ and $M_2 = \sum_{i \ge 2} (i - 1) p_i$.

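Proposition 3.1. There exists a positive constant $d_1$ such that the contribution
of paths for which $M_1 + M_2 \ge d_1 \sqrt{s_N}$ is negligible in the large-$N$ limit,
whatever $1 \le k \le s_N$ is.

And for any $\alpha, \alpha'$ such that $0 < \alpha' < \frac{\sqrt{\gamma}}{1 + \sqrt{\gamma}} < \alpha < 1$, one has that

$$\sum_{k \ge \alpha s_N} Z(k) + \sum_{k \le \alpha' s_N} Z(k) = o(1)\, u_+^{s_N}.$$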

Proof of Proposition 3.1. We first give the proof of the first point of
Proposition 3.1. Denote by $Z(k, (\tilde n, \tilde p))$ the contribution of paths with $k$ odd
marked instants and of type $(\tilde n, \tilde p)$. Using (15), Remark 2.4 (and exactly the
same arguments as in [29] p. 34), one can see that, for $d_1$ large enough,

$$\sum_{k=1}^{s_N} \ \sum_{(\tilde n, \tilde p) \,:\, \sum_{i \ge 2} (i - 1)(n_i + p_i) \ge d_1 \sqrt{s_N}} Z(k, (\tilde n, \tilde p)) = o(1)\, u_+^{s_N}.$$

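We now turn to the second statement. Let then $\alpha$ and $\alpha'$ be chosen as in
Proposition 3.1. We assume that $N$ is large enough so that $\alpha' < \frac{\sqrt{\gamma_N}}{1 + \sqrt{\gamma_N}} < \alpha$.
We now show that $\sum_{k \ge \alpha s_N} Z(k) = o(1)\, u_+^{s_N}$. Given any integer $k \le s_N$, and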

using (jlSp . one can show that there exists a constant > independent of 

and k such that Z{k) < N'y%exp{CsN'^^^}a^'"^'N{sN,k). Thus 

^ Z{k)< (r^'''^{sN,k)j%Nexp{CsN^^^} 
< a'^-NClci-'4e^p jcgiY^/^ - Cjs,, (a - ^^^^^ | 

for $N$ large enough. Similarly, the contribution of paths for which $k \le \alpha' s_N$
is negligible in the large-$N$ limit. □



3.2 Asymptotics of $\mathbb{E}[\mathrm{Tr} M_N^{s_N}]$

In this subsection, we refine the estimate (15) and in particular deal with
vertices of type 2. Indeed, when summing (15) over $n_i$, $i \le s_N$ and $p_i$, $i \le s_N$,
one can note that terms associated to vertices of type 2 make the summation
go to infinity. To this aim, we shall control the number of vertices for which
there is an ambiguity to continue the path at an unmarked instant. We
shall also control the number of such vertices associated to edges passed
four times or more. Finally, we shall also show that amongst vertices of
type 3, none belongs to edges passed more than twice, while there are no
more complex self-intersections in typical paths.



From now on, given $\alpha' s_N \le k \le \alpha s_N$, we consider a path $P_k$ of type $(\tilde n, \tilde p)$
with $M_1 = \sum_{i \ge 2} (i - 1) n_i \le d_1 \sqrt{s_N}$ (resp. $M_2 = \sum_{i \ge 2} (i - 1) p_i \le d_1 \sqrt{s_N}$)
self-intersections on the bottom (resp. top) line. Our counting strategy is
refined as follows. Knowing the moments of self-intersection, we first choose
the vertices occurring at the remaining marked moments. One fills in the
blanks of the path until the first moment of self-intersection is encountered.
Then, one chooses the vertex which is repeated among the preceding ones
(and repeats it if needed at the moment of second self-intersection and so
on). We then proceed in the same way for subsequent vertices.

Assume that the instants of self-intersections have been chosen on each line:
$t_{j_{b,1}} < t_{j_{b,2}} < \dots < t_{j_{b,n_2}}$ for vertices of type 2 on the bottom line,
$t_{j_{u,1}} < t_{j_{u,2}} < \dots < t_{j_{u,p_2}}$ for vertices of type 2 on the top line,
$t^{3,1}_{j_{b,1}} < t^{3,1}_{j_{b,2}} < \dots < t^{3,1}_{j_{b,n_3}}$ for the first repetition of a vertex of type 3 on the bottom line,
$t^{3,2}_{j_{b,1}} > t^{3,1}_{j_{b,1}},\ t^{3,2}_{j_{b,2}} > t^{3,1}_{j_{b,2}}, \dots$ for the second repetition of a vertex of type 3
on the bottom line. We do not go deeper in the list of instants since it
is exactly the same as in [29] p. 724, except that we make a distinction
between instants marked on the top or bottom line (even if it is the same
vertex). The number of pairwise distinct vertices in the order of appearance
on each line (top or bottom) occurring in the path is at most

$$N \prod_{i=1}^{s_N - k - M_1} (N - i) \prod_{i=1}^{k - M_2} (p + 1 - i). \qquad (25)$$

Note that m ~ iV-iv-A^i-A^2+i7^-^^2 gxp{-^^ - 
First, we focus on vertices of type 2. In the general case, there are $j_{u,i} - i$
choices for the vertex occurring at the instant $t_{j_{u,i}}$, since one chooses vertices
occurring twice as marked instants. Assume first that the path is such that
there are no choices for closing any edge from such vertices at unmarked
instants and that none belongs to edges passed four times or more. Then
choosing the instants and vertices of type 2 gives a contribution at most of
order

YlUu,i -i) Ylijb,i - i) 

^<tju,l<tju.2<--<tju,P2 <SN — ki=l l<tjb,l <tj6,2<'"<tjb,ri2 <k 1=1 

- ns! V 2 J p2\\2 J - 

Such an estimate combined with formula (25) and Remark 2.4 then ensures
that the contribution of such paths to $\mathbb{E}[\mathrm{Tr}(M_N/u_+)^{s_N}]$ is bounded.

Yet amongst vertices of type 2 on the bottom or on the top line, at an
unmarked instant where one closes an edge with such a vertex as left endpoint,
there might be a choice for closing the edge. Note that there are at
most three choices. An example of such a vertex is the distinguished vertex
4 on Figure 2, as the distinguished edge could have been (4,4). Indeed, the
two up edges with 4 as marked vertex on the top line are read before the
first time a down edge is closed starting from 4. This leads to the notion of
non-closed vertex.
Definition 3.1. A vertex $v$ of type 2 is said to be non-MP-closed if it is marked
at an odd (resp. even) instant and if there is more than one choice for
closing an edge at an unmarked instant starting from this vertex on the top
(resp. bottom) line.

Remark 3.1. The definition of non-MP-closed vertices differs from that of
non-closed vertices in [29], essentially due to the distinction which is made
between the top and bottom lines.

Let $t$ be a given instant. Assume that the marked vertices before $t$ have
been chosen and that, at the instant $t$, there is a non-MP-closed vertex.
Then, by the definition of the trajectory and of non-MP-closed vertices,
there are at most $x(t)$ possible choices for this vertex. This can be checked
as in [28], p. 122. In Lemma 3.1 below, we show that $\max_t x(t) \sim \sqrt{s_N}$ in
typical paths.

Apart from non-MP-closed vertices, a vertex of type 2 can also belong
to an edge that is read four times in the path. To consider such vertices,
we need to introduce other characteristics of the path. Let $\nu_N(P)$ be the
maximal number of vertices that can be visited at marked instants from a
given vertex of the path $P$. Let also $T_N(P)$ be the maximal type of a vertex
in $P$. Then, if at the instant $t$, one reads for the second time an oriented up
edge $e$, there are at most $2(\nu_N(P) + T_N(P))$ choices for the vertex occurring
at the instant $t$. Indeed, one shall look among the oriented edges already
encountered in the path and of which one endpoint is the vertex occurring at
the instant $t - 1$ (see e.g. the Appendix in [28] and [9] Section 5.1.2). It is an
easy fact that paths for which T/v(-P) > ^A^^/'^(ln A^)~^ lead to a negligible 
contribution, if $A$ is large enough (independently of $k$). We prove at the end
of this subsection, using Lemma 3.2 stated below, that there exists $\epsilon > 0$
small enough such that, for typical paths,

t^n{P) < Sjv^ for any a sn < k < asN- 

For vertices of type $i > 2$, once the $i - 1$ moments of self-intersection are
fixed, one chooses at the first moment of self-intersection the vertex to be
repeated amongst those that have already occurred in the path.

Assuming the above estimates on $\max x(t)$ and $\nu_N$ hold, we consider paths
$P_k$ of type $(\tilde n, \tilde p)$ with $M_1 := \sum_{i \ge 2} (i - 1) n_i \le d_1 \sqrt{s_N}$ and
$M_2 := \sum_{i \ge 2} (i - 1) p_i \le d_1 \sqrt{s_N}$ self-intersections respectively on the bottom or on the top
line, $r_i$ non-MP-closed vertices of type 2 ($i = 1, 2$) on the bottom and top
lines and $q_i$ ($i = 1, 2$) vertices of type 2 on the bottom or top line, visited at
the second marked instant along an oriented edge already seen in the path.
By Proposition 2.3, (25) and the above, their contribution to $\mathbb{E}[\mathrm{Tr} M_N^{s_N}]$ is
then at most of order (see also [29], p. 725)


1 ({sN-kfY'"'''^' 1 /4(sjv-A:)max x(t)^"^ 



{n2-ri-qi)\\ 2N J nl \ N 

J_ / D,{sN-k)il^N+TN) Y' yr ^ f C'{sn - kf ^ 
91 ! V N J nil V A^'-i 

1 fpy-'''-''^ I'fAkmaxxit)'^''' 



1 f Dik{uN + TN)y^ ri 1 fC'k'^^ 



V P J X^Pi- \P' 

Here $D_3, D_4$ are positive constants independent of $k$, $p$, $N$ and $s_N$, and $\mathbb{E}_k$
denotes the expectation with respect to the uniform distribution on $\mathcal{X}_{s_N,k}$.
Here we have used the fact that $\sum_{x \in \mathcal{X}_{s_N,k}} f(x) = \mathbf{N}(s_N, k)\, \mathbb{E}_k(f(x))$ for
any function $f \ge 0$. Before considering paths in complete generality, we
first restrict to paths with less than $d_1 \sqrt{s_N}$ self-intersections and no
self-intersection of type strictly greater than 3: $\sum_{i \ge 4} p_i + n_i = 0$.
Let $Z_3(k)$ denote the total contribution of paths with $k$ odd marked instants
such that $q := q_1 + q_2 = 0$, with no oriented edges read more than twice and
satisfying the above conditions.

Proposition 3.2. There exists a constant $B_1 > 0$ independent of $N$ such
that $Z_3 := \sum_{k = \alpha' s_N}^{\alpha s_N} Z_3(k) \le B_1 u_+^{s_N}$.

Proof of Proposition 3.2. From (26), we deduce that there exists a constant
$D_0 > 0$ independent of $N$, $p$, $k$ and $s_N$, such that



Zsik) < a''^N{sN, k)Nj%Ek (^exp {6 ^^}J exp {Doj^}. (27) 

In Lemma 3.1 proved below, we show that, given $a > 0$, there exists $b > 0$,
independent of $N$, such that
$\mathbb{E}_k\left( \exp\left\{ a \max_t x(t) / \sqrt{s_N} \right\} \right) \le b$, $\forall \alpha' s_N \le k \le \alpha s_N$. This
yields that (27) $\le D_5\, \sigma^{2 s_N} \mathbf{N}(s_N, k)\, N \gamma_N^k$, for some constant $D_5$ independent
of $k$. Remark 2.4 ensures that $Z_3 := \sum_k Z_3(k) = O(u_+^{s_N})$. □
Assuming that there are no self-intersections of type greater than 3, we
can then show that paths for which $q = q_1 + q_2 \ge 1$ give a contribution
of order $\nu_N / \sqrt{s_N}$ and thus there are no edges read more than twice
(associated to vertices of type 2). We then proceed in the same way to
show that there are no more than $\ln \ln N$ vertices of type 3 in typical paths
and that there are no oriented edges read more than twice associated to
vertices of type 3. It is then easy to deduce from the above result that paths
with self-intersections of type 4 or greater, or a marked origin, lead to a
contribution of order $u_+^{s_N} s_N/N = o(1)\, u_+^{s_N}$. The detail is skipped.

Finally we investigate the total contribution of paths for which > 
•^jv^ where e > is fixed (small). Denote by .^4 such a contribution. We 
only indicate the tools needed to prove that 

ocsn 

since the detail of the proof is a rewriting of the arguments of the proof
of Lemma 7.8 in [9]. To consider such paths, we introduce the following
characteristic of the path, namely $N_0 := r_1 + r_2 + \sum_{i \ge 3} i n_i + i p_i$. Assume
then that $k$, $N_0$, $q_1$ and $q_2$ are given. We can then divide the interval $[0, 2 s_N]$
into $N_0$ sub-intervals, so that inside an interval, there are only closed vertices
of type 2 or no self-intersection. Then there is no choice for closing the edges
inside these sub-intervals. Assume that a vertex $v$ is the starting point of $\nu_N$
up edges. Then, there is a time interval $[s_1, s_2]$ during which the trajectory
of Pk comes v'j^ := times to the level Xq (of v) and never goes below. 
Denote by r(i^^) the event that there exists such an interval in a trajectory 
and let denote the uniform distribution on XsN,k- In Lemma [3. 21 we show 
that the probability of such an event decreases as 

max Fk(T{iy']^)) < Ais%exp{-A2i^N}, 

l<fc<sjv 

for some positive constants $A_1, A_2$ independent of $s_N$. Using Lemma 3.1
one can also show that there exists a constant $A > 0$ such that
$\max x(t) \le A N^{1/3}$ in any non-negligible path. Using these estimates, formula (26),
and Lemma 3.2 proved below, one can then copy the arguments used in [9],
Lemma 7.8, to deduce that $Z_4 = o(1)\, u_+^{s_N}$.

This finishes the proof that typical paths have a non-marked origin, vertices
of type 3 at most (and less than $\ln \ln N$ of type 3), less than $d_1 \sqrt{s_N}$
self-intersections and no edges read more than twice. The proof of Theorem 3.1
is completed once Lemmas 3.1 and 3.2 are proved. □
3.3 Technical Lemmas



In this subsection, we prove the results used in the previous subsection
on characteristics of typical paths. The first quantity of interest here is the
maximum level reached by the trajectory of a path, namely $\max_{t \in [0, 2 s_N]} x(t)$.
We shall show that it roughly behaves as $\sqrt{s_N}$ in typical paths. The second
one is the maximal number of vertices visited from a given vertex, $\nu_N(P)$,
which should not grow faster than $s_N^{\delta}$ for any power $\delta$ (but we get a weaker
bound).

We shall now prove the announced estimate for $\max_{t \in [0, 2 s_N]} x(t)$, where
$x(t)$ denotes the level of the trajectory $X_k$ associated to a path $P = P_k$. Let $a$
be some constant independent of $N$ and $k$ and denote by $\mathbb{E}_k$ the expectation
with respect to the uniform distribution $\mathbb{P}_k$ on the set of trajectories $\mathcal{X}_{s_N,k}$
of length $2 s_N$ with $k$ up edges at odd instants.

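Lemma 3.1. Given $a > 0$, there exists $b > 0$ independent of $s_N$ (and $N$)
such that

$$\max_{\alpha' s_N \le k \le \alpha s_N} \mathbb{E}_k \exp\left\{ \frac{a \max_{t \in [0, 2 s_N]} x(t)}{\sqrt{s_N}} \right\} \le b.$$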

Proof of Lemma 3.1. It was proved in [28] that the above result holds if
one replaces $\mathbb{E}_k$ with the expectation with respect to the uniform distribution
on Dyck paths (no constraint on $k$) of length $2 s_N$. We will call on this result
to prove Lemma 3.1. To this aim, we cut the Dyck path $X_k$ into 2-steps,
so that there are 4 types of basic 2-steps: $UU$, $UD$, $DD$ and $DU$ ($D$ stands
here for down, $U$ for up). It is an easy fact that the number of $UU$ steps
equals that of $DD$ steps. Let then $l$ be the number of $UU$ steps (and $DD$
steps), $k_2$ be that of $DU$ steps and $k_3$ be that of $UD$ steps. Then,

$$2l + k_3 + k_2 = s_N, \qquad l + k_2 = s_N - k, \qquad l + k_3 = k. \qquad (28)$$

As a step $UD$ or $DU$ brings the path to the same level, it is easy to see that
the steps $UU$ and $DD$ are arranged in such a way that they form a Dyck
path (if we identify a $UU$ step with an up step and a $DD$ step with a down
step) of length $2l$. We denote by $Dy(X_k)$ this sub-Dyck path associated to the
trajectory $X_k$ (see Fig. 3).
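
The 2-step decomposition can be verified exhaustively for small lengths. The sketch below encodes a trajectory as a string over U and D (an illustrative encoding), cuts it into 2-steps, extracts the sub-path obtained from the UU and DD steps, and checks the three relations (28) together with the fact that the maximum of the trajectory is twice the maximum of the sub-path, possibly plus one. The function names are hypothetical.

```python
from itertools import product

def is_dyck(path):
    """True if the U/D word never goes below level 0 and ends at level 0."""
    level = 0
    for step in path:
        level += 1 if step == "U" else -1
        if level < 0:
            return False
    return level == 0

def max_level(path):
    """Maximum level reached by the U/D word (0 for the empty word)."""
    level = best = 0
    for step in path:
        level += 1 if step == "U" else -1
        best = max(best, level)
    return best

def two_step_profile(path):
    """Counts of the four 2-step types and the sub-path built from UU -> U, DD -> D."""
    pairs = [path[i:i + 2] for i in range(0, len(path), 2)]
    counts = {kind: pairs.count(kind) for kind in ("UU", "DD", "UD", "DU")}
    sub = "".join("U" if pair == "UU" else "D" for pair in pairs if pair in ("UU", "DD"))
    return counts, sub

def check_relations(s):
    """Check (28) and the max-level relation on every Dyck path of length 2s."""
    for steps in product("UD", repeat=2 * s):
        path = "".join(steps)
        if not is_dyck(path):
            continue
        counts, sub = two_step_profile(path)
        # k = number of up steps at odd (1-based) instants
        k = sum(1 for t, step in enumerate(path, start=1) if step == "U" and t % 2 == 1)
        l, k2, k3 = counts["UU"], counts["DU"], counts["UD"]
        assert counts["DD"] == l and is_dyck(sub)
        assert 2 * l + k3 + k2 == s and l + k2 == s - k and l + k3 == k
        assert max_level(path) - 2 * max_level(sub) in (0, 1)
    return True
```

For example, UUDUDD splits into UU, DU, DD, so l = 1, k_2 = 1, k_3 = 0 and the sub-path is UD.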

Figure 3: A trajectory and the associated trajectory $Dy(X_k)$.

We now explain how to build a trajectory $X_k$ given $Dy(X_k)$ of length
$2l$ and $s_N - k - l$ (resp. $k - l$) $DU$ (resp. $UD$) steps. To construct $X_k$
from $Dy(X_k)$, one has to insert "horizontal" steps, namely $DU$ and $UD$
steps, in a particular way. Note that two distinct insertions lead to two
different trajectories. The sole constraint is to insert steps $DU$ when the
path $Dy(X_k)$ is at a level greater than or equal to one. This is the reason
why we enumerate Dyck paths with $2l$ steps according to the number of
times they come back to the level 0. Call $\mathbf{N}_{Dyck}(l, Q)$ the number of Dyck
paths with $2l$ steps and $Q$ returns to 0. We then have to insert $s_N - k - l$
horizontal $DU$ steps into $2l - Q$ boxes. Then we can insert the $UD$ steps
arbitrarily. This yields that
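$$N(s_N, k) = \sum_{l=0}^{k \wedge (s_N - k)} \sum_{Q=0}^{l} \sharp Dyck(l, Q)\, C_{s_N - k + l - Q - 1}^{s_N - k - l}\, C_{s_N}^{k - l}. \qquad (29)$$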
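The insertion argument can be sanity-checked numerically. The sketch below assumes one reading of the construction: the $s_N - k - l$ DU steps are distributed (with repetition) among the $2l - Q$ positions of $Dy(X_k)$ at positive level, and the $k - l$ UD steps are then placed freely, giving $C_{s_N}^{k-l}$ choices. The function names are hypothetical, and the comparison is against a direct enumeration of Dyck paths with $k$ up edges at odd instants.

```python
from collections import Counter
from math import comb


def dyck_paths(s):
    """Yield all Dyck paths of length 2s as tuples of +1 (U) and -1 (D)."""
    def rec(path, level, ups):
        if len(path) == 2 * s:
            yield tuple(path)
            return
        if ups < s:
            path.append(1)
            yield from rec(path, level + 1, ups + 1)
            path.pop()
        if level > 0:
            path.append(-1)
            yield from rec(path, level - 1, ups)
            path.pop()
    yield from rec([], 0, 0)


def odd_up_steps(path):
    """Number of up edges at odd instants 1, 3, 5, ... (indices 0, 2, 4, ...)."""
    return sum(1 for i in range(0, len(path), 2) if path[i] == 1)


def returns_to_zero(path):
    """Number of times the path comes back to level 0 (final return included)."""
    level, count = 0, 0
    for step in path:
        level += step
        if level == 0:
            count += 1
    return count


def ways_to_fill(items, boxes):
    """Ways to put `items` indistinguishable steps into `boxes` positions."""
    if boxes == 0:
        return 1 if items == 0 else 0
    return comb(items + boxes - 1, items)


def count_by_insertion(s, k):
    """Count trajectories via the double sum over l and Q (assumed form)."""
    total = 0
    for l in range(0, min(k, s - k) + 1):
        by_returns = Counter(returns_to_zero(p) for p in dyck_paths(l))
        for q, n_paths in by_returns.items():
            du_choices = ways_to_fill(s - k - l, 2 * l - q)
            ud_choices = comb(s, k - l)
            total += n_paths * du_choices * ud_choices
    return total


def count_by_enumeration(s, k):
    """Directly count Dyck paths of length 2s with k odd up steps."""
    return sum(1 for p in dyck_paths(s) if odd_up_steps(p) == k)
```

Comparing `count_by_insertion(s, k)` with `count_by_enumeration(s, k)` for small `s` tests both the bijectivity of the insertion and the box count $2l - Q$.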

From this construction, it is easy to see that the maximum level reached
by the trajectory $X_k$ is twice the maximum (+1) reached by the sub-path
$Dy(X_k)$. Let then $\mathcal{Y}_{l,Q}$ denote the set of Dyck paths of length $2l$ with $Q$
returns to 0. We do not consider the degenerate case where $Dy(X_k) = \emptyset$,
which corresponds to the trajectory obtained with UD steps only. Then one
has that

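$$P_k\left(\max X_k(t) = r\right) = \sum_{l=1}^{s_N/2} \sum_{Q=0}^{s_N} P_k\left(\max Dy(X_k) = \tfrac{r}{2} \,\middle|\, Dy(X_k) \in \mathcal{Y}_{l,Q}\right) P_k\left(Dy(X_k) \in \mathcal{Y}_{l,Q}\right)$$
$$\qquad + \sum_{l=1}^{s_N/2} \sum_{Q=0}^{s_N} P_k\left(\max Dy(X_k) = \tfrac{r-1}{2} \,\middle|\, Dy(X_k) \in \mathcal{Y}_{l,Q}\right) P_k\left(Dy(X_k) \in \mathcal{Y}_{l,Q}\right). \qquad (30)$$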
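The relation between the maximum of a trajectory and the maximum of its sub-Dyck path, which underlies the two sums in (30), can be verified exhaustively on short paths. This is a minimal sketch with hypothetical helper names; a path is a tuple of +1 (U) and -1 (D), and the sub-path reads each UU 2-step as an up step and each DD 2-step as a down step.

```python
def dyck_paths(s):
    """Yield all Dyck paths of length 2s as tuples of +1 (U) and -1 (D)."""
    def rec(path, level, ups):
        if len(path) == 2 * s:
            yield tuple(path)
            return
        if ups < s:
            path.append(1)
            yield from rec(path, level + 1, ups + 1)
            path.pop()
        if level > 0:
            path.append(-1)
            yield from rec(path, level - 1, ups)
            path.pop()
    yield from rec([], 0, 0)


def max_level(path):
    """Maximum level reached by a +1/-1 path started at 0."""
    level, best = 0, 0
    for step in path:
        level += step
        best = max(best, level)
    return best


def sub_dyck(path):
    """Sub-path made of the UU (up) and DD (down) 2-steps only."""
    steps = [(path[i], path[i + 1]) for i in range(0, len(path), 2)]
    return tuple(1 if st == (1, 1) else -1
                 for st in steps if st in ((1, 1), (-1, -1)))


def max_is_twice_sub_max(s):
    """Check max X = 2 * max Dy(X) or 2 * max Dy(X) + 1 for every path."""
    return all(max_level(p) - 2 * max_level(sub_dyck(p)) in (0, 1)
               for p in dyck_paths(s))
```

The two admissible offsets, 0 and 1, correspond to the two conditional probabilities in (30), at $r/2$ and $(r-1)/2$ respectively.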

Let $P_{l,Q}$ denote the uniform distribution on $\mathcal{Y}_{l,Q}$. Let also $a_0 > 0$ (small) be
given. It can easily be inferred from [25], p. 11 (see also and [U], Lemma
7.10) that there exist positive constants $a_1, a_2$, independent of $l$ and $Q$ such
that, if $r' \ge a_0 \sqrt{l}$, one has that

$$P_{l,Q}\left(\max x(t) = r'\right) \le \frac{a_1}{\sqrt{l}} \exp\left\{-\frac{a_2 r'^2}{l}\right\}. \qquad (31)$$

Thus, inserting (31) in (30), we deduce that there exist some positive
constants $a_3, a_4$ independent of $k$, $s_N$ and $N$ such that, provided $r \ge a_0 \sqrt{s_N}$
and for any $\alpha' s_N \le k \le \alpha s_N$, $P_k(\max X_k(t) = r) \le \frac{a_3}{\sqrt{s_N}} \exp\{-\frac{a_4 r^2}{s_N}\}$. This
yields Lemma 3.1. □

The second estimate we need is a suitable bound on $\nu_N(P_k)$. Recall that
$\Gamma(\nu'_N)$ denotes the event that the trajectory of a path comes back from above
$\nu'_N$ times to some level $x_0$.

Lemma 3.2. There exist positive constants $A_1, A_2$ independent of $k, N, p$
such that

$$\max_k P_k\left(\Gamma(\nu'_N)\right) \le A_1 s_N^2 \exp\{-A_2 \nu'_N\}. \qquad (32)$$

Proof of Lemma 3.2. Let $[s_1, s_2]$ be an interval such that $x(s_1) = x(s_2) =
x_0$ for some $x_0 \ge 0$ and $x(t) \ge x_0$, $\forall t \in [s_1, s_2]$, and for which there exist
$s_1 < t_1 < t_2 < \cdots < t_{\nu'_N} < s_2$ such that $x(t_i) = x_0$. We first consider the
case where $s_1$ and $s_2$ are even instants (then $x_0$ is also even). Modifications
to be made in the case where they are odd will be indicated at the end of the
proof. The instants $t_i$ are then called instants of returns from above to $x_0$ of
the trajectory $X_k$. Set now $Y_0$ to be the Dyck path of length $s_2 - s_1$ defined
by $y_0(t) = x(t + s_1) - x_0$, $t \in [0, s_2 - s_1]$. Then the returns from above to $x_0$
correspond to returns to 0 of $Y_0$. Now, the returns to 0 of $Y_0$ can either be
made using UD steps or correspond to a return of the sub-trajectory $Dy(Y_0)$
to this level. Thus, either the number of UD steps is large or the number of
returns of $Dy(Y_0)$ to level 0 is large. We shall show that in both cases, (32)
holds. Thanks to Proposition 3.1, it is enough to consider trajectories $X_k$
for which $\alpha' s_N \le k \le \alpha s_N$. The proof of Lemma 3.2 is divided into three
steps.
Step 1 We first show that there exist positive constants $C_8, C_9$ independent
of $N$ and $k$ such that, provided $\alpha' s_N \le k \le \alpha s_N$,

$$P_k\left(X_k \text{ has } \eta_N \text{ consecutive UD steps}\right) \le C_8 s_N^2 \exp\{-C_9 \eta_N\}. \qquad (33)$$

Assume that there exists a time interval $[s'_1, s'_2]$ with $\eta_N$ consecutive UD
steps only. Given even instants $s'_1$ and $s'_2$ (with $s'_2 - s'_1 = 2\eta_N$), the proportion
of trajectories that have $\eta_N$ steps UD in $[s'_1, s'_2]$ is at most

SN^VN ["^sm-vn) ^^^f k{k-l)---{k-r]N + l) \ ^ (j^^2m 



J_ (c^ Y \sn{sn - 1) • • • (sat - r/AT + 1), 

(34) 

for some constant $C_9 > 0$. This readily yields (33).
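The proportion estimated in Step 1 can be illustrated numerically. The sketch below relies on two assumptions that are not spelled out in this section: first, that $N(s, k)$ is the Narayana number $\frac{1}{s} C_s^k C_s^{k-1}$ (this agrees with direct enumeration for small $s$); second, that deleting a block of $\eta$ UD 2-steps located at a fixed window is a bijection onto trajectories of length $2(s - \eta)$ with $k - \eta$ odd up steps, so that the proportion is the ratio $N(s - \eta, k - \eta) / N(s, k)$. All function names are hypothetical.

```python
from math import comb


def narayana(s, k):
    """Narayana number (1/s) * C(s, k) * C(s, k-1), for 1 <= k <= s."""
    if s < 1 or k < 1 or k > s:
        return 0
    return comb(s, k) * comb(s, k - 1) // s


def ud_run_proportion(s, k, eta):
    """Proportion of trajectories with eta UD 2-steps in a fixed window."""
    return narayana(s - eta, k - eta) / narayana(s, k)


def dyck_paths(s):
    """Yield all Dyck paths of length 2s as tuples of +1 (U) and -1 (D)."""
    def rec(path, level, ups):
        if len(path) == 2 * s:
            yield tuple(path)
            return
        if ups < s:
            path.append(1)
            yield from rec(path, level + 1, ups + 1)
            path.pop()
        if level > 0:
            path.append(-1)
            yield from rec(path, level - 1, ups)
            path.pop()
    yield from rec([], 0, 0)


def count_with_ud_run(s, k, eta, start):
    """Brute-force count: k odd up steps and UD at 2-steps start..start+eta-1."""
    count = 0
    for path in dyck_paths(s):
        odd_ups = sum(1 for i in range(0, 2 * s, 2) if path[i] == 1)
        if odd_ups != k:
            continue
        window = range(2 * start, 2 * (start + eta), 2)
        if all(path[i] == 1 and path[i + 1] == -1 for i in window):
            count += 1
    return count
```

With `s = 40` and `k = 20`, `ud_run_proportion(40, 20, eta)` decreases quickly as `eta` grows, which is the kind of exponential decay in $\eta_N$ that (33) expresses when $k/s_N$ stays bounded away from 1.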



Step 2 We consider the case where the number of returns to level 0 made
by the associated path $Dy(Y_0)$ is large. It was proved in [28] that, if $P_s$
denotes the uniform distribution on the set of Dyck paths $Y$ with length $2s$,
then there exist constants $C_{10}, C_{11}$ independent of $s$ such that

$$P_s\left(\exists\, s'_1, s'_2 : Y \text{ has } \eta_N \text{ returns from above to the level } x_0 \text{ in } [s'_1, s'_2]\right) \le C_{10} s^2 \exp\{-C_{11} \eta_N\}. \qquad (35)$$

Denote by $Q$ the number of sub-Dyck paths $Y_i$, $i = 1, \ldots, Q$, of $Dy(Y_0)$ starting
and ending at level 0. From the above result and (29), one can deduce
that there exist constants $C_{12}, C_{13} > 0$, independent of $s_N$ and $k$, such that

$$P(Q = \eta_N) \le C_{12} s_N^2 \exp\{-C_{13} \eta_N\}. \qquad (36)$$

Step 3 We can now turn to the proof of Lemma 3.2. A trajectory $X_k$
coming back $\nu'_N$ times to the level $x_0$ during $[s_1, s_2]$ can be described as
follows. Denote by $Q$ the number of sub-Dyck paths $Y_i$, $i = 1, \ldots, Q$, going
from level $x_0$ to $x_0$ and starting with a UU step and ending with a DD step.
Denote by $l_i$, $i = 1, \ldots, Q$, the respective lengths of these sub-Dyck paths.
Then these sub-Dyck paths are interspaced by $\nu'_N - Q$ UD steps that split
in at most $Q + 1$ sequences. Let $u_i$ ($i \le Q + 1$) be the respective lengths
of these disjoint sequences of UD steps from $x_0$ to $x_0$. Using the estimates
of Step 1 and Step 2, there exist constants $C'_{12}, C'_{13} > 0$ such that for any
constants $A, A' > 0$ (fixed later),

$$P_k\left(\exists\, i : u_i \ge \nu'_N / A\right) \le C'_{12} s_N^2 \exp\{-C'_{13} \nu'_N / A\}, \quad \forall\, 1 \le k \le s_N,$$
$$P_k\left(Q \ge \nu'_N / A'\right) \le C'_{12} s_N^2 \exp\{-C'_{13} \nu'_N / A'\}, \quad \forall\, 1 \le k \le s_N. \qquad (37)$$

Thus it is enough to consider trajectories such that $A \le Q \le \nu'_N / A'$. To
count these trajectories, we study their structure in more detail. Set $L_0 =
(s_2 - s_1)/2$ and let then $k_0 \le k$ be the number of odd marked instants
of the sub-trajectory inside the interval $[s_1, s_2]$. The remaining trajectory
$x(t)$, $t \in [0, 2s_N] \setminus [s_1, s_2]$, is then a Dyck path of length $2s_N - 2L_0$ with
$k - k_0$ odd up steps. Given $s_1$ and $s_2$, and in order to count the number of
such trajectories $X_k$, we first re-order the paths $Y_i$ and UD steps inside the
interval $[s_1, s_2]$ as follows. We first read the Dyck paths $Y_i$, $i = 1, \ldots, Q$, and
then read all the UD steps.

Fix some $0 < \epsilon < (1 - \alpha)/2$ (small). Assume first that

$$L_0 - k_0 \le (1 - \epsilon) s_N + \epsilon \left(L_0 - (\nu'_N - Q)\right) - k. \qquad (38)$$

The latter condition ensures, as $k \le \alpha s_N$ for some $\alpha < 1$, that $k - k_0 +
\nu'_N - Q \le (1 - \epsilon)(s_N - L_0 + \nu'_N - Q)$. Thus, we can apply Step 1 to the
sub-trajectory obtained from $x$ by erasing the sub-paths $Y_i$, $i = 1, \ldots, Q$.
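For the reader's convenience, the algebra behind the inequality just stated can be written out. As far as the computation goes, it is a pure rearrangement of (38): subtract $L_0$ and add $k$ on both sides, then add $\nu'_N - Q$.

```latex
\begin{align*}
L_0 - k_0 &\le (1-\epsilon)s_N + \epsilon\bigl(L_0 - (\nu'_N - Q)\bigr) - k \\
\iff\quad k - k_0 &\le (1-\epsilon)(s_N - L_0) - \epsilon(\nu'_N - Q) \\
\iff\quad k - k_0 + \nu'_N - Q &\le (1-\epsilon)\bigl(s_N - L_0 + \nu'_N - Q\bigr).
\end{align*}
```

In words, the fraction of odd up steps in the sub-trajectory of half-length $s_N - L_0 + \nu'_N - Q$ stays below $1 - \epsilon$, which is the regime in which the bound of Step 1 is used.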
Then, given $Q$, $k_0$, $s_1$ and $s_2$, the number of trajectories of length $L =
2s_N - 2L_0 + 2(\nu'_N - Q)$ with $k - k_0 + \nu'_N - Q$ odd up steps and that have
$\nu'_N - Q$ UD steps between $[L - (s_N - s_2) - 2(\nu'_N - Q), L - (s_N - s_2)]$ is of
order

$$C_{14} \exp\{-C'_0 (\nu'_N - Q)\} \times N_2, \quad \text{if } N_2 = \sharp\{X_{2s_N - 2L_0 + 2(\nu'_N - Q),\, k - k_0 + \nu'_N - Q}\}.$$

Here $C_{14}, C'_0$ are some positive constants independent of $s_N$, $k$ and $N$. Note
that the constant $C'_0$ depends only on $\epsilon$ and $\alpha$. Then, the number of Dyck
paths of length $\sum_{i=1}^{Q} l_i = 2L_0 - 2(\nu'_N - Q)$, with $k_0 - (\nu'_N - Q)$ odd up steps
and coming back $Q$ times to the level 0 using DD steps is at most of order

$$C_{15} \exp\{-C''_0 Q\} \times N_1, \quad \text{if } N_1 = \sharp\{X_{2L_0 - 2(\nu'_N - Q),\, k_0 - (\nu'_N - Q)}\}.$$

As above $C_{15}, C''_0$ are positive constants independent of $N$, $s_N$ and $k$.
Finally the number of ways to order the paths $Y_i$ and the UD steps inside
the interval $[s_1, s_2]$ is equal to the number of ways to write $\nu'_N - Q$ as a sum
of $Q + 1$ integers. There are $C_{\nu'_N}^{Q}$ such ways.
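The count of orderings is the classical "stars and bars" identity: the number of ways to write $\nu'_N - Q$ as an ordered sum of $Q + 1$ non-negative integers is the binomial coefficient $C_{\nu'_N}^{Q}$. A small brute-force check, with a hypothetical function name, is sketched below.

```python
from itertools import product
from math import comb


def count_weak_compositions(total, parts):
    """Count ordered tuples of `parts` non-negative integers summing to `total`."""
    return sum(1 for c in product(range(total + 1), repeat=parts)
               if sum(c) == total)


def orderings_match_binomial(nu):
    """Check the identity for every Q between 0 and nu."""
    return all(count_weak_compositions(nu - q, q + 1) == comb(nu, q)
               for q in range(nu + 1))
```

For example, with $\nu'_N = 3$ and $Q = 1$, the two sequences of UD steps have lengths (0, 2), (1, 1) or (2, 0), which gives 3 orderings.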

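Thus the number of trajectories coming $\nu'_N$ times to some level $x_0$ never
falling below is at most, if $C_0 = \min\{C'_0, C''_0\}$,

$$\sum_{0 \le s_1 < s_2 \le s_N} \sum_{Q=A}^{\nu'_N/A'} \sum_{k_0 \le k} C_{\nu'_N}^{Q} C_{16} \exp\{-C_0 \nu'_N\} N_1 N_2 \le C_{16} s_N^2 \sum_{Q=A}^{\nu'_N/A'} C_{\nu'_N}^{Q} \exp\{-C_0 \nu'_N\} N(s_N, k), \qquad (39)$$

since, $L_0$, $Q$, $k$ and $\nu'_N$ being fixed, $\sum_{k_0} N_1 N_2 \le N(s_N, k)$. This yields the
following estimate:

$$P_k\left(X_k \text{ has } \nu'_N \text{ returns to } x_0,\ A \le Q \le \nu'_N / A'\right) \le \nu'_N s_N^2 e^{-C_0 \nu'_N} C_{\nu'_N}^{\nu'_N / A'}.$$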

We can then choose $A'$ large enough so that there exists a constant $C_{18} > 0$,
independent of $N$, $k$ and $s_N$, such that

$$\nu'_N e^{-C_0 \nu'_N} C_{\nu'_N}^{\nu'_N / A'} \le C_{18} \exp\{-C_0 \nu'_N / 2\}.$$

This yields Lemma 3.2 if (38) is satisfied.
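Why a large $A'$ suffices can be seen from the standard entropy bound $\ln C_n^m \le n H(m/n)$, where $H(x) = -x \ln x - (1-x) \ln(1-x)$: as soon as $H(1/A') \le C_0 / 2$, the binomial factor costs at most $\exp\{C_0 n / 2\}$. The sketch below is only a numerical illustration; the value $C_0 = 0.5$ and the function names are arbitrary choices made here, and the polynomial prefactor $\nu'_N$ is left aside, since it can be absorbed by slightly reducing the rate and enlarging the constant.

```python
from math import comb, log


def binary_entropy(x):
    """Entropy H(x) in nats, with H(0) = H(1) = 0."""
    if x <= 0.0 or x >= 1.0:
        return 0.0
    return -x * log(x) - (1.0 - x) * log(1.0 - x)


def smallest_a_prime(c0):
    """Smallest integer A' >= 2 such that H(1/A') <= c0 / 2."""
    a_prime = 2
    while binary_entropy(1.0 / a_prime) > c0 / 2.0:
        a_prime += 1
    return a_prime


def log_weighted_binomial(n, a_prime, c0):
    """ln( exp(-c0 * n) * C(n, floor(n / A')) )."""
    return log(comb(n, n // a_prime)) - c0 * n


def bound_holds(c0, n_max):
    """Check exp(-c0 n) C(n, n/A') <= exp(-c0 n / 2) for A' <= n < n_max."""
    a_prime = smallest_a_prime(c0)
    return all(log_weighted_binomial(n, a_prime, c0) <= -c0 * n / 2.0 + 1e-9
               for n in range(a_prime, n_max))
```

Calling `smallest_a_prime(c0)` for a given rate shows how large $A'$ has to be taken in this crude version of the argument; smaller rates $C_0$ require larger $A'$.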

Assume now that (38) is not satisfied. Then necessarily $k_0 \le (\alpha + \epsilon) L_0 \le
(\alpha + 1) L_0 / 2$. Thus the number of trajectories $Y_0$ coming $\nu'_N$ times to some
level $x_0$ with $Q$ returns made using DD steps is at most

$$C_{\nu'_N}^{Q}\, N\left(L_0 - (\nu'_N - Q),\, k_0 - (\nu'_N - Q),\, Q\right),$$

where $N(L_0 - (\nu'_N - Q), k_0 - (\nu'_N - Q), Q)$ is the number of Dyck paths of
length $2L_0 - 2(\nu'_N - Q)$, with $Q$ returns to 0 made only with DD steps and
admitting $k_0 - (\nu'_N - Q)$ odd up steps. From Step 2, one has that there
exists a constant $C_{20}$ (independent of $k_0$, $L_0$) such that

$$N\left(L_0 - (\nu'_N - Q),\, k_0 - (\nu'_N - Q),\, Q\right) \le e^{-C_{20} Q}\, N\left(L_0 - (\nu'_N - Q),\, k_0 - (\nu'_N - Q)\right).$$

As $k_0 \le (\alpha + \epsilon) L_0$, there also exists $C_{21} > 0$ (depending on $\epsilon$ and $\alpha$ only)
such that $N(L_0 - (\nu'_N - Q), k_0 - (\nu'_N - Q)) \le \exp\{-C_{21} (\nu'_N - Q)\} N(L_0, k_0)$.
The end of the proof is as above. This finishes the proof of Lemma 3.2 in
the case where $s_1$ and $s_2$ are even.

To consider the case where $s_1$ and $s_2$ are odd, one can then use exactly
the same arguments as above, up to the following modifications. An odd
marked instant of $Y_0$ simply defines an even marked instant of $X_k$. Then it
is an easy task to show that Step 1 holds if one replaces $\alpha$ with $1 - \alpha'$ in
(34). Step 2 and Step 3 can then be obtained using arguments as above. □

This finally completes the proof of Theorem 3.1.
4 The case where $N \to \infty$ and $N/p \to 0$

In this section, we consider the sequence of random sample covariance matrices
$M_p = \frac{1}{p} XX^*$ instead of $M_N$. We also consider moments of traces of
$M_p$ to some powers $s_N \sim \sqrt{\gamma_N} N^{2/3}$. The above scaling readily comes from
Theorem 1.4 proved by [6]. In particular, one has that



$$E\left[\mathrm{Tr}\, M_p^{s_N}\right] = \frac{1}{\gamma_N^{s_N}}\, E\left[\mathrm{Tr}\, M_N^{s_N}\right]. \qquad (40)$$



We here prove the following universality result. Let $K$ be some given integer
and $c_1, \ldots, c_K$ be positive constants chosen in a fixed compact interval $J$ of
$(0, \infty)$. Consider sequences $s_N^{(i)}$, $i = 1, \ldots, K$, such that
$\lim_{N \to \infty} s_N^{(i)} / (\sqrt{\gamma_N} N^{2/3}) = c_i$.



Theorem 4.1. Let Va 



/7iv 



. There exists Ci (K) > such that 



E 



E 



K 



1=1 

K 



1 = 1 



XX* 
XX*'-' ^ 



< Ci{K) and 



E 



K 



i=l 



f XgX*(. 



(1 + 0(1)). 



The proof of Theorem 4.1 is the object of the whole section. We only
consider the case where $K = 1$ and $s_N$ is some sequence such that

$$\lim_{N \to \infty} \frac{s_N}{\sqrt{\gamma_N} N^{2/3}} = c, \quad \text{for some real } c > 0. \qquad (41)$$

As in the preceding section, we establish that the typical paths have no edges
read more than twice. This ensures that the leading term in the asymptotic
expansion of $E[\mathrm{Tr}\, M_p^{s_N}]$ is the same as that for Wishart ensembles. The
idea of the proof is very similar to that of the preceding section, but requires
some minor modifications. This is essentially due to the discrepancy between
marked vertices on the bottom and top lines, due to the fact that $p \gg N$.
In this section, $C$, $C_i$, $D_i$, $B_i$, $i = 0, \ldots, 9$, denote some positive constants
independent of $N$, $p$, $k$ and $s_N$ whose value may vary from line to line (and
from the preceding sections).
4.1 Typical paths

We now state the counterpart of the second point in Proposition 3.1 in the
two following Propositions. Let $Z_{\infty,0} = Z_{\infty,0}(A)$ be the sub-sum corresponding
to the contribution to (40) of paths for which $k \le s_N(1 - A/\sqrt{\gamma_N})$,
for some $A > 0$ to be fixed.

Proposition 4.1. There exists A> such that Z^ofl = 0(1)^^^. 
Proof of Proposition 4.1. By (15), one has that

sjv(l-^) 

T^SM-k + iK^^-) 

Now there exists a constant $C_1 > 0$ such that, for any $1 \le k \le s_N$,

E ni(?^)"^-p{^i^^^/n. (43) 

Pi,l<i<k i>2 \ -f^ / 

Similarly, using the fact that J2i=i^'''^i = sn — k, we deduce that 

V 1'-^ ni,l<i<SN-k i>2 * ^ 

< 1 -rr - f £iv^^ '''' 

(l,\ sjv— 
"-^j , (44) 

as $s_N - k \ge A s_N \gamma_N^{-1/2}$, provided $A > 2$. Inserting (43) and (44) in (42)
yields that

^ SN SN -k+1 \ y^/yNSNj 



I 

(45) 
We then deduce the following upper bound. For $N$ large enough, one has



A ,^/7^'^"' V(^iV-/c)2; \ SN J 

k<S]^(l ^ 

<C4^2..exp{C4ivV3} ^ l^^j =o(lK-, 



where in the last line we have chosen A > 4e^. This finishes the proof of 
Proposition 4.1. □

Given $0 < \epsilon < 1/2$, we also consider the contribution $Z_{\infty,1} = Z_{\infty,1}(\epsilon)$ of
paths for which $k \ge s_N(1 - \epsilon/\sqrt{\gamma_N})$.

Proposition 4.2. There exists < e < 1/2 such that Zoo,i = o{l)v^ . 

Proof of Proposition 4.2. By (15) and as $s_N - k = O(N^{2/3})$, one has
that

Zo.,1 <2a^'^ Yl iVN(sJv,fc)exp{(Cl + C^)ivV3}^^(^^'-'=) 

< a^^^C',Nexp{C',N'/''}N{sN, [sn - + 1])7^ ^ 

< C!^N exp 

provided $\epsilon < 1/2$. This is enough to ensure Proposition 4.2. □

Set now $I_N = [s_N(1 - A/\sqrt{\gamma_N}),\, s_N(1 - \epsilon/\sqrt{\gamma_N})]$. Thanks to Proposition 4.1
and Proposition 4.2, typical paths are such that $k \in I_N$. This implies in
particular that i^n^ = 0(iVi/3) and ^ = 0{N^/^). Using the fact that 

Efce/jv o-^'''^N(sAr, k)Nj'^~''^ = 0(u+^), it is easy to deduce from that 
it is enough to consider paths for which Mi + M2 = X^j>2(^ ~ l)('^i + 
Pi) < diA^^/^ for some constant di independent k G 1^, N and p. For 
such paths, denote by $Z_\infty(k)$ the contribution of paths with $k$ odd marked
instants. We now prove the following Proposition yielding Theorem 4.1.

Proposition 4.3. There exists Di > such that YlkeiN — ^1'^+' ■ 

Furthermore, the contribution of paths admitting either an edge read more
than twice, or more than $\ln \ln N$ vertices of type 3, or a vertex of type 4 or
greater, or a marked origin is negligible in the large-$N$ limit.
Remark 4.1. Considering paths $P_k$, $k \in I_N$, admitting only vertices of type
2 at most and no non-MP-closed vertices, one can also deduce from the
subsequent proof of Proposition 4.3 that there exists $D_2 > 0$ such that



sjv 



Proof of Proposition 4.3. First, one can state the counterpart of Formula
(26). Due to the different scales $s_N - k = O(N^{2/3})$, while $k = O(\sqrt{\gamma_N} N^{2/3})$,
we need in this section to distinguish vertices being the left endpoint of an up
edge according to the parity of the corresponding instant. Define $\nu_{N,o}(P_k)$
(resp. $\nu_{N,e}(P_k)$) to be the maximum number of vertices visited (at marked
instants) from a vertex of the path occurring at odd instants (resp. even
instants). Then,



Zoc{k) < Ca2^^N(siv,A:)iV7^"'^e^ ^ 



"2,r-i ,91,713, ...,ns„_fc P2,r2,q2,P3,--;Pk 



{{SN - fc)V(2jV)) 

(712 - ri - qi)l 



n2-ri-qi 



1 fA{sN - k) max x{t)y^ 1 f Ds{sn - k){uN,o + Tn)"" 



ri! V N J qi\ V N 



{e/{2p)Y'-'''-''' 1 /4fcmax:E(t)V^ 1 [ D^k{uN,e + Tk^^ 



{P2 -r2- 92)! ^2! V P J 92! V P 

One still has that Tn < A"N^/^/lnN for some A" > in typical paths 
(independently of $k \in I_N$). This ensures that the analysis performed for the
case where $\lim_{N \to \infty} \gamma_N < \infty$ can be copied, provided the counterparts of
technical lemmas of Subsection 3.3 hold. Let $a > 0$ be given. Assume for a
while that typical paths are such that there exists $\epsilon' > 0$ such that, $\forall k \in I_N$,

( max3;(t) \ 
maxEfc expja -75 — } < 0, for some > , 

VN,o < iV^/^-^' and i^N,e < VinN^^^'"' . (48) 

The above statement will be proved in the subsequent subsection (Lemma
4.1 and Lemma 4.2) and using exactly the same arguments as in Lemma 7.8
in [9]. We then copy the arguments of Proposition 3.2 and the sequel. Then
it is easy to deduce that typical paths have a non-marked origin, vertices
of type 3 at most (and a number of vertices of type 3 smaller than $\ln \ln N$)
and no edge passed more than twice. The other paths lead to a negligible
contribution. We can also deduce that non-MP-closed vertices of type 2 as
well as vertices of type 3 occur only on the bottom line in typical paths.
In particular, let $Z_{3,\infty}$ denote the contribution of paths for which $q_1 + q_2 = 0$,
$r_2 = 0$, $\sum_{i \ge 4} n_i + \sum_{i \ge 3} p_i = 0$, $n_3 \le \ln \ln N$ and no edges read more than
twice. Then $E[\mathrm{Tr}\, M_p^{s_N}] = Z_{3,\infty}(1 + o(1))$ and

This ensures that the limiting expectation depends only on the variance $\sigma^2$
and has the same behavior as for Wishart ensembles. The proof of Theorem
4.1 is complete, provided we prove the announced Lemmas. □



4.2 Technical Lemmas 

We now state the counterpart of Lemma 3.1.

Lemma 4.1. Given $a > 0$, there exists $b > 0$ such that

f maxx{t){sN - k) \ 
maxEfc expja — \ < b. 

(r max x{t)k -1 \ 
e p j — 1 << 

1. 



Proof of Lemma 4.1. The proof refers to the proof of Lemma 3.1 in
Subsection 3.3. Let $l$ be the number of UU steps of a trajectory $X_k$, $k \in
I_N$. From (28), one deduces that $l \le s_N - k \le A s_N \gamma_N^{-1/2}$. Thus by (31), we
deduce that, if $r \ge a_0 \sqrt{A s_N \gamma_N^{-1/2}}$, and for any $k \in I_N$,

$$P_k\left(\max x(t) = r\right) \le \frac{a_3}{\sqrt{A s_N \gamma_N^{-1/2}}} \exp\left\{-\frac{a_4 r^2}{A s_N \gamma_N^{-1/2}}\right\}.$$

This readily proves Lemma 4.1. □

One next turns to establish the counterpart of Lemma 3.2. We denote by
$\Gamma_{\nu_{N,o}}(X_k)$ (resp. $\Gamma_{\nu_{N,e}}(X_k)$) the event that the maximal number of times the
trajectory comes from above (without falling below) to some level $x_0$ at
odd instants (resp. even instants) is $\nu_{N,o}$ (resp. $\nu_{N,e}$).
Lemma 4.2. There exist positive constants $B_1, B_2, B_3, B_4$, independent of
$N$ and $p$ such that



max Fk{T,^^^x,))<Bi^exp{-B2UN,o}-m 



sjv(l-^=)<A:<sjv(l-^^) ' IN 



Proof of Lemma 4.2. As in the preceding section, we have to estimate
the probability that the trajectory comes to some level $x_0$ $\nu'_N$ times without
falling below in some time interval $[s_1, s_2]$. Note that the two steps leading
and starting at $s_1$ (resp. $s_2$) are up (resp. down) steps. This is because
$[s_1, s_2]$ is a maximal interval. Thus the number of possible choices for $s_1$
and $s_2$ is at most of order $s_N^2 / \gamma_N$, as $k \in I_N$.

We first prove (49) and thus assume that $s_1$ is odd. Then the returns to $x_0$
occur at odd instants. The counterpart of formula (34) states



^k\te[s[,s'.,] has only r]N UD steps ) < CU —— . (51) 



HN J 



The two steps preceding (and following) $[s'_1, s'_2]$ are either both up steps or
both down steps (regardless of the fact that $s'_1$, $s'_2$ are even or odd). The
estimate (35) still holds (up to the change $s^2 \to s_N^2 / \gamma_N$) so that formula
(49) is proved, copying the proof of Lemma 3.2.

We now turn to the proof of (50), which is more involved than in Lemma
3.2. Formula (34) translates to

Pfc (Xk has UD steps in between [s'i,S2]) < Cgexpl— e }. 



Step 1 and Step 2 are then obtained as in Lemma 3.2 (with $s^2 \to s_N^2 / \gamma_N$).
From that, we can deduce that we can consider in Step 3 only the paths
for which $A_1 \le Q \le A_0 \nu'_N$, for some constants $A_1, A_0 > 0$. We need
to refine the estimate for Step 3. Let then $[s_1, s_2]$ be the interval where
$\nu'_N$ returns to some level $x_0$ occur. We call $Y_0$ the trajectory defined by
$x(t) - x_0$, $t \in [s_1, s_2]$. We then define $k_0$ to be its number of odd up steps,
$Q$ to be its number of returns to 0 using DD steps, $l$ (resp. $\mu_0$, $\nu'_0$) to be
its number of UU steps (resp. of DU steps and of UD steps occurring at
some positive level). Assume that $l$, $Q$, $\mu_0$, $k_0$ are given and observe that
$k_0 = l + \nu'_N - Q + \nu'_0$. Let then $P_{l,Q,k_0,\mu_0}$ denote the conditional probability
on the event that $Y_0$ has $k_0$ odd up steps, $\mu_0$ DU steps and $Dy(Y_0)$ has $2l$
steps and $Q$ returns to 0. Then, one has that

Yo has v'^-QUD steps at level ) < i^-T'^^'''^^ • 

One first shows that it is enough to consider the subpaths Yq such that 
^ = ^(1 + 0(1)) G [1 - e-S^'/',! - ei7^'^'] for some ei > smah 
enough. This follows from the fact that $k \in I_N$ and arguments already used
in Subsection 2.5 (see also (17)). This yields that



N(^,MN(s^-^,A; 



- ^ ^ N(sjv,A;) 

Si<S2 ^feo_<-|_ -1^-1/2 ^ ^ 

<£kexp{-r?iV2/3}, 
for some r] > provided ^ > 2 A, where A has been fixed in Proposition 

— 1/2 

I4.1[ The analysis of the case where 2ko/{s2 — si)) > 1 — eiTjv similar. 
We can assume that i?4 is small enough to ensure that B^^sn < ^/InV^'^^^ ■ 
This yields (I50p and ensures that it is enough to consider the case where 
^ = A(i + o(l)) G [1 - _ Fixing / and fc^ we 

set Qt := (1 - iJ^T^M - As / < ^ - A^^ < one has that 



< ^21 < One can check that there exist a constant C„ independent 
of Si, S2, and / such that, for a given constant A2 > 4, 



— ^ ^ < Co exp {- — r-TTTT"}' if Q > Qa2 ■■= Qt(1 - VI2), 



^fc-.)/2 (A2 + 1)Qt- 



<Coexp{ ^} 7 ,ifQ<(5A2 



Thus it is clear that the proportion of paths coming back i^'j^ times from 
above to some level Xq and for which Q < Qri^ — ei) is at most of order 
4/7ivexp{-(^^|^}. Paths for which Q > Qt(1 - ei) > ^ are 
considered as in Step 2. This is enough to ensure (fSOl) . □ 
Remark 4.3. The investigation of higher moments follows the same steps
as in Subsection 2.5. In particular, considering $\mathrm{Var}(\mathrm{Tr}\, M_p^{s_N})$, only pairs of
correlated paths such that $P_1 \vee P_2$ has a number of odd up steps of order

2sAr(l — 0(y^7^)) are non-negligible. In ([IS]), one can also replace the 
term {2sn — 1 — I) with sat — (1 + l)/2 — ki. Considering as above the 
exponential decay of (pOj) . one can also show that '}2ki<kA2sN-i-i fi^'^) — 
C'^{2sN - 1 - / - kiY'^fih) where h = Thus ([22]) can be 

replaced with 



^ C?,{SN 7, ^l)W 1= /(^l) < (^4 



1=0 



/Tjv (1 - anf 



where = k/{2sN — 1) ~ 1 — I/^/ttv- One can readily deduce from the 
above that the contribution of (jl9p is negligible. The case where x{te) = 
yields a negligible contribution, as readily seen from ([48]) and Lemma 
4.2. The latter is then enough to ensure that $\mathrm{Var}(\mathrm{Tr}\, M_p^{s_N})$ is bounded and
only depends on the variance of the entries. The investigation of higher
moments is similar.